Re: AW: Howto verify that update is "in-place"

2017-10-18 Thread alessandro.benedetti
Given the immutability that drives Lucene's segment design, I think Emir's
observation sounds correct.

Since docValues are a column-based data structure stored per segment, I guess
that an in-place update just re-indexes that one field.
This means a new segment containing the information has to be written and,
if it is flushed to disk, potentially merged.
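
Whether an update can take the in-place path at all depends on the schema. As far as I recall from the Solr Reference Guide, the updated field has to be a non-indexed, non-stored, single-valued numeric docValues field, and only the "set" and "inc" modifiers qualify (the _version_ field and any copyField targets need the same non-indexed, non-stored, single-valued docValues setup). A minimal sketch of that check in Python follows; the dict layout, the "numeric" key and the function name are assumptions made for illustration, not a Solr API:

```python
from typing import List, Mapping


def in_place_blockers(field: Mapping[str, object], operation: str) -> List[str]:
    """Return the reasons an update of `field` would not be in-place.

    `field` is a hypothetical dict mirroring schema attributes; the
    "numeric" key is an assumption standing in for the field type.
    Missing attributes are treated conservatively (as if they block).
    An empty list means the field looks eligible.
    """
    reasons = []
    if operation not in ("set", "inc"):
        reasons.append(f"operation '{operation}' is not 'set' or 'inc'")
    if field.get("indexed", True):
        reasons.append("field is indexed")
    if field.get("stored", True):
        reasons.append("field is stored")
    if field.get("multiValued", False):
        reasons.append("field is multi-valued")
    if not field.get("docValues", False):
        reasons.append("field has no docValues")
    if not field.get("numeric", False):
        reasons.append("field type is not numeric")
    return reasons


if __name__ == "__main__":
    # Hypothetical definition for the Property_2 field from the thread.
    property_2 = {"indexed": False, "stored": True, "multiValued": False,
                  "docValues": True, "numeric": True}
    print(in_place_blockers(property_2, "set"))
```

If the list is non-empty, Solr would be expected to fall back to a regular atomic update, which re-indexes the whole document.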



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: AW: Howto verify that update is "in-place"

2017-10-18 Thread Emir Arnautović
Hi,
Not claiming that is the case here, but I think I've read some comments (I
think in some Jira issue) suggesting that in-place updates are not as cheap as
one might think: they do not reindex the doc, but they do rewrite the doc
values for the segment. I did not look at the code, but if someone is familiar
with this, could they please jump in and comment on how cheap in-place updates
are?

Thanks,
Emir

On Oct 17, 2017 2:11 PM, "James"  wrote:

> I found a solution which works for me:
>
> Add a document with very little tokenized text and write down QTime (for
> me: 5ms)
> Add another document with very much text (I used about 1MB of Lorem Ipsum
> sample text) and write down QTime (for me: 70ms).
> Perform an update operation on document 2 which you want to test whether
> it is "in-place" and compare QTime.
> For me it was again 70ms. So I assume that my operation did re-index the
> whole document and was thus not an in-place update.
>
>
> ----- Original Message -----
> From: Amrit Sarkar [mailto:sarkaramr...@gmail.com]
> Sent: Tuesday, 17 October 2017 12:43
> To: solr-user@lucene.apache.org
> Subject: Re: Howto verify that update is "in-place"
>
> James,
>
> > @Amrit: Are you saying that the _version_ field should not change when
> > performing an atomic update operation?
>
>
> It should change. A new version will be allotted to the document. I am not
> that sure about in-place updates; a test run will probably verify that.
>
> Amrit Sarkar
> Search Engineer
> Lucidworks, Inc.
> 415-589-9269
> www.lucidworks.com
> Twitter http://twitter.com/lucidworks
> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>

AW: Howto verify that update is "in-place"

2017-10-17 Thread James
I found a solution which works for me:

Add a document with very little tokenized text and note the QTime (for me: 
5ms).
Add another document with a lot of text (I used about 1MB of Lorem Ipsum 
sample text) and note the QTime (for me: 70ms).
Perform the update operation you want to test on document 2 and compare the 
QTime.
For me it was again 70ms, so I assume that my operation re-indexed the whole 
document and was thus not an in-place update.
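
The comparison above can be sketched in Python roughly as follows. This is only an illustration: the helper names are made up, the "ID" key is taken from the update command quoted in this thread, and the "closer to the large add than to the small add" rule is an assumed heuristic. A single QTime reading is noisy, so repeated measurements would be more convincing.

```python
import json
import urllib.request


def atomic_set_body(doc_id, field, value, id_field="ID"):
    """Build the JSON body for an atomic 'set' update of one field."""
    return json.dumps([{id_field: doc_id, field: {"set": value}}])


def post_update(update_url, body):
    """POST a JSON body to an update handler and return the raw response.

    Needs a running Solr; `update_url` is whatever your setup uses.
    """
    request = urllib.request.Request(
        update_url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def qtime_ms(response_body):
    """Extract QTime (milliseconds) from a JSON update response."""
    return json.loads(response_body)["responseHeader"]["QTime"]


def closer_to_full_reindex(small_add_ms, large_add_ms, update_ms):
    """Assumed heuristic: the update took about as long as adding the
    large document, which hints at a full re-index of that document."""
    return abs(update_ms - large_add_ms) <= abs(update_ms - small_add_ms)


if __name__ == "__main__":
    # Timings reported in this thread: 5 ms, 70 ms, and 70 ms for the update.
    print(closer_to_full_reindex(5, 70, 70))
```

With a live server, something like `qtime_ms(post_update(url, atomic_set_body(1133, "Property_2", 124)))` would give the third number, where `url` is your collection's update handler with `wt=json` appended so that the response is JSON.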



AW: Howto verify that update is "in-place"

2017-10-17 Thread Julian Ohrt
Hi Emir and Amrit,

@Emir: Nice idea, but after changing any document in any way and 
committing the changes, all doc counters (Num, Max, Deleted) are still the same; 
the only thing that changes is the Version (it increases in steps of 2).
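
For anyone who wants to script this counter check instead of reading the admin UI: the Luke request handler (/admin/luke on a core, with wt=json) reports numDocs, maxDoc and deletedDocs under its "index" key. The sketch below only parses and compares two such responses; the function names are invented for illustration, and the sample bodies in the usage part are made-up numbers, not output from my server. Note that a segment merge between the two snapshots can reclaim deleted docs, so an unchanged counter is not conclusive either way.

```python
import json

COUNTERS = ("numDocs", "maxDoc", "deletedDocs")


def index_counters(luke_body):
    """Pick the document counters out of a Luke handler JSON response."""
    index = json.loads(luke_body)["index"]
    return {name: index[name] for name in COUNTERS}


def changed_counters(before, after):
    """Return {counter: (old, new)} for every counter that differs."""
    return {
        name: (before[name], after[name])
        for name in COUNTERS
        if before[name] != after[name]
    }


if __name__ == "__main__":
    # Made-up sample responses, trimmed to the fields used here.
    before = index_counters('{"index": {"numDocs": 100, "maxDoc": 100, "deletedDocs": 0}}')
    after = index_counters('{"index": {"numDocs": 100, "maxDoc": 101, "deletedDocs": 1}}')
    print(changed_counters(before, after))
```

An empty result after a committed update would match what Emir expects for an in-place update; a higher maxDoc and deletedDocs would point to a re-indexed document.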

@Amrit: Are you saying that the _version_ field should not change when 
performing an atomic update operation?

Thanks
James

----- Original Message -----
From: Amrit Sarkar [mailto:sarkaramr...@gmail.com] 
Sent: Tuesday, 17 October 2017 11:35
To: solr-user@lucene.apache.org
Subject: Re: Howto verify that update is "in-place"

Hi James,

Each update you send via an atomic operation contains the "id" / 
"uniqueKey". Comparing the "_version_" field value for one of them would be 
fine for a batch. Emir has listed the rest.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2

On Tue, Oct 17, 2017 at 2:47 PM, Emir Arnautović < 
emir.arnauto...@sematext.com> wrote:

> Hi James,
> I did not try it, but checking max and num docs might tell you whether the
> update was in-place or atomic. An atomic update reindexes the existing doc,
> so the old doc will be deleted. An in-place update should just update the doc
> values of the existing doc, so the number of deleted docs should not change.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection Solr & 
> Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 17 Oct 2017, at 09:57, James  wrote:
> >
> > I am using Solr 6.6 and carefully read the documentation about
> > atomic and in-place updates. I am pretty sure that everything is set
> > up as it should.
> >
> >
> >
> > But how can I make certain that a simple update command actually performs an
> > in-place update without internally re-indexing all other fields?
> >
> >
> >
> > I am issuing this command to my server:
> >
> > (I am using implicit document routing, so I need the "Shard" 
> > parameter.)
> >
> >
> >
> > {
> >   "ID":1133,
> >   "Property_2":{"set":124},
> >   "Shard":"FirstShard"
> > }
> >
> >
> >
> >
> >
> > The log outputs:
> >
> >
> >
> > 2017-10-17 07:39:18.701 INFO  (qtp1937348256-643) [c:MyCollection s:FirstShard r:core_node27 x:MyCollection_FirstShard_replica1] o.a.s.u.p.LogUpdateProcessorFactory [MyCollection_FirstShard_replica1] webapp=/solr path=/update params={commitWithin=1000&boost=1.0&overwrite=true&wt=json&_=1508221142230}{add=[1133 (1581489542869811200)]} 0 1
> >
> > 2017-10-17 07:39:19.703 INFO  (commitScheduler-283-thread-1) [c:MyCollection s:FirstShard r:core_node27 x:MyCollection_FirstShard_replica1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
> >
> > 2017-10-17 07:39:19.703 INFO  (commitScheduler-283-thread-1) [c:MyCollection s:FirstShard r:core_node27 x:MyCollection_FirstShard_replica1] o.a.s.s.SolrIndexSearcher Opening [Searcher@32d539b4[MyCollection_FirstShard_replica1] main]
> >
> > 2017-10-17 07:39:19.703 INFO  (commitScheduler-283-thread-1) [c:MyCollection s:FirstShard r:core_node27 x:MyCollection_FirstShard_replica1] o.a.s.u.DirectUpdateHandler2 end_commit_flush
> >
> > 2017-10-17 07:39:19.703 INFO  (searcherExecutor-268-thread-1-processing-n:192.168.117.142:8983_solr x:MyCollection_FirstShard_replica1 s:FirstShard c:MyCollection r:core_node27) [c:MyCollection s:FirstShard r:core_node27 x:MyCollection_FirstShard_replica1] o.a.s.c.QuerySenderListener QuerySenderListener sending requests to Searcher@32d539b4[MyCollection_FirstShard_replica1] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_i(6.6.0):C5011/1) Uninverting(_j(6.6.0):C478) Uninverting(_k(6.6.0):C345) Uninverting(_l(6.6.0):C4182) Uninverting(_m(6.6.0):C317) Uninverting(_n(6.6.0):C399) Uninverting(_q(6.6.0):C1)))}
> >
> > 2017-10-17 07:39:19.703 INFO  (searcherExecutor-268-thread-1-processing-n:192.168.117.142:8983_solr x:MyCollection_FirstShard_replica1 s:FirstShard c:MyCollection r:core_node27) [c:MyCollection s:FirstShard r:core_node27 x:MyCollection_FirstShard_replica1] o.a.s.c.QuerySenderListener QuerySenderListener done.
> >
> > 2017-10-17 07:39:19.703 INFO  (searcherExecutor-268-thread-1-processing-n:192.168.117.142:8983_solr x:MyCollection_FirstShard_replica1 s:FirstShard c:MyCollection r:core_node27) [c:MyCollection s:FirstShard r:core_node27 x:MyCollection_FirstShard_replica1] o.a.s.c.SolrCore [MyCollection_FirstShard_replica1] Registered new searcher Searcher@32d539b4[MyCollection_FirstShard_replica1] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_i(6.6.0):C5011/1) Uninverting(_j(6.6.0):C478) Uninverting(_k(6.6.0):C345) Uninverting(_l(6.6.0):C4182) U

AW: Howto verify that update is "in-place"

2017-10-17 Thread James
Hi Emir and Amrit, thanks for your responses!

@Emir: Nice idea, but after changing any document in any way and 
committing the changes, all doc counters (Num, Max, Deleted) are still the same; 
the only thing that changes is the Version (it increases in steps of 2).

@Amrit: Are you saying that the _version_ field should not change when 
performing an atomic update operation?

Thanks
James

