Re: knapsack use case

2015-01-19 Thread Matteo Moci
…just a small update in case anyone was wondering:

I completed an elasticsearch-knapsack export to file from a 0.90.7 instance
(with the plugin built against 0.90.7 dependencies), and it was correctly
re-imported into a 1.4.2 instance running the latest plugin version,
including settings and aliases.

I just checked out the source from GitHub, changed 2-3 lines to account for
API changes, and assembled the plugin for installation on the 0.90.7 instance.

Just wanted to say thanks!

Best,
Matteo


On Thu, Jul 31, 2014 at 11:33 AM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:

> Snapshot/restore is always recommended, but is a 1.0 feature. This is a
> standard API of ES and well supported by the ES team. With that, you can
> handle all kinds of index data safely on a binary level, fully and
> incrementally.
>
> Knapsack plugin is for document export/import only. I wrote it to
> transport _source data harvested over a long time period from a < 1.0
> system to a production system. It works on _source or stored fields only.
> It uses the search/query and bulk indexing APIs without snapshots, so it is
> up to the admin to stop index writes while knapsack runs. Index settings and
> mappings are also looked up, included in the export archive file, and
> re-applied at import time. However, there is no check that these
> settings/mappings can be applied successfully on the target; preparing
> plugins, analyzers, etc. is left to the admin. Aliases are not transported,
> but this is a good idea for improvement.
>
> Currently, the knapsack plugin does not work on ES 1.3, but I am working on
> this. I am also adding a Java-level API. Currently it is REST only.
>
> Jörg
>
>
> On Thu, Jul 31, 2014 at 11:05 AM, Matteo Moci wrote:
>
>> Hi All,
>> I have some questions about the knapsack plugin [1].
>>
>> My idea is to use the tool to do a backup to a file, starting from a 0.90.x
>> instance, and then restore it on a different 1.2.x or 1.3.x instance. I see
>> it can't be done directly by copying to a local/remote cluster.
>>
>> Would it work with an intermediate step through a file?
>> Or does the backup still carry metadata about the ES version it was
>> generated from, making that impossible?
>>
>> Is the snapshot and restore feature [2] useful in my use case, or not?
>>
>> Is the knapsack plugin also able to back up and restore aliases and
>> mappings, or do I have to migrate them manually before restoring data?
>>
>> Thanks for the patience and the great work!
>> Matteo
>>
>> [1] https://github.com/jprante/elasticsearch-knapsack
>> [2]
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html
>>
>> --
>> Matteo Moci
>> http://mox.fm
>>



-- 
Matteo Moci
http://mox.fm
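
Jörg's description above (export `_source` via search, re-index via the bulk API) can be illustrated with a minimal sketch. This is not knapsack's code or its archive format, which the thread does not show; it only builds a bulk request body from already-exported documents. The function name `to_bulk_payload`, the index name `books`, the type name `book`, and the sample documents are all hypothetical.

```python
import json


def to_bulk_payload(docs, index, doc_type):
    """Build a bulk request body: one action line and one source line per document."""
    lines = []
    for doc in docs:
        # Action metadata tells the bulk API where to index the following source line.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["_id"]}}
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(doc["_source"], sort_keys=True))
    # The bulk API expects newline-delimited JSON terminated by a final newline.
    return "\n".join(lines) + "\n"


# Hypothetical exported documents (id plus _source), as a scan search might return them.
exported = [
    {"_id": "1", "_source": {"title": "first"}},
    {"_id": "2", "_source": {"title": "second"}},
]
payload = to_bulk_payload(exported, index="books", doc_type="book")
print(payload, end="")
```

Because only `_source` travels this way, anything the target needs beyond the documents (analyzers, plugins, compatible mappings) still has to be prepared by the admin, as Jörg notes.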

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAONgFZ6PvJyeF04ERyeb26LhXoHR%3DMMs5sc5KV2ASLF_UK6b%3Dw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


knapsack use case

2014-07-31 Thread Matteo Moci
Hi All,
I have some questions about the knapsack plugin [1].

My idea is to use the tool to do a backup to a file, starting from a 0.90.x
instance, and then restore it on a different 1.2.x or 1.3.x instance. I see
it can't be done directly by copying to a local/remote cluster.

Would it work with an intermediate step through a file?
Or does the backup still carry metadata about the ES version it was
generated from, making that impossible?

Is the snapshot and restore feature [2] useful in my use case, or not?

Is the knapsack plugin also able to back up and restore aliases and
mappings, or do I have to migrate them manually before restoring data?

Thanks for the patience and the great work!
Matteo

[1] https://github.com/jprante/elasticsearch-knapsack
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html

-- 
Matteo Moci
http://mox.fm



Re: SearchContextMissingException when doing scan search [0.90.7-0.90.13]

2014-04-28 Thread Matteo Moci
Well spotted, Joerg!
I knew the scroll id was changing, but in the end I coded it wrong anyway.

Thanks and keep up the good work!
Matteo




On Mon, Apr 28, 2014 at 1:29 PM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:

> Reviewing your code, you have made a small mistake.
>
> In the while loop, do not use currentScanSearchResp.getScrollId()
>
>
> https://gist.github.com/mox601/545c7176785ef209f7f3#file-scroll-search-java-L59
>
> This can get outdated in the while loop.
>
> Instead, use searchResponse.getScrollId()
>
> Background info: the scroll ID changes from response to response!
>
> Jörg
>
>
>
> On Mon, Apr 28, 2014 at 12:44 PM, Matteo Moci wrote:
>
>> Hi Jorg,
>> Investigating which call the exception relates to,
>> I ran the search on 2 types, and I'll show you the merged logs of my
>> client and the one printed on the node:
>>
>> [2014-04-28 12:09:07,*438*][DEBUG][action.search.type   ] [moxbook-pro] [2] Failed to execute query phase
>> 2014-04-28 12:09:07,*442*/CEST [main] INFO finished scrolling, hits length <=0
>> [2014-04-28 12:09:07,*455*][DEBUG][action.search.type   ] [moxbook-pro] [3] Failed to execute query phase
>> 2014-04-28 12:09:07,*456*/CEST [main] INFO finished scrolling, hits length <=0
>>
>> The client logs are printed by the client code (running on the same
>> machine), once per type, after exiting the while (hits.length > 0) scroll
>> loop and before calling clear scroll at the end.
>>
>> So to wrap up, according to what I see, the error message
>> of the failing execute query phase happens before the clear scroll.
>>
>> Following another path and reading your first email,
>> I added to the configuration file elasticsearch.yml the line:
>>
>> search.keep_alive_interval: 1H
>> hoping to set the search keep alive explicitly on the node.
>>
>> Starting the node like this:
>> $ ./elasticsearch -f -Des.logger.level=DEBUG
>> gives this log [1], but I can't find any mention of the keep-alive interval
>> I set up.
>>
>> Is it the right parameter name, and should it help?
>>
>> Thanks,
>> Matteo
>>
>> [1] https://gist.github.com/mox601/11368114
>>
>>
>>
>> On Mon, Apr 28, 2014 at 11:53 AM, joergpra...@gmail.com <
>> joergpra...@gmail.com> wrote:
>>
>>> OK, the "new Scroll(...)" shouldn't make any difference.
>>>
>>> Maybe the cause of the error message is the clear scroll call at the
>>> end? If so, it shouldn't be serious.
>>>
>>> Jörg
>>>
>>
>>
>>
>> --
>> Matteo Moci
>> http://mox.fm
>>



-- 
Matteo Moci
http://mox.fm
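
The mistake Jörg points out above (reusing the scroll ID from the initial scan response instead of the one from the latest response) can be shown with a small self-contained sketch. It does not use the Elasticsearch client: `FakeScrollServer` is a hypothetical in-memory stand-in, and it is deliberately stricter than a real node, rejecting every ID except the most recent one so that the mistake fails immediately.

```python
import itertools


class FakeScrollServer:
    """Hypothetical in-memory stand-in for a scan/scroll API (not the Elasticsearch client)."""

    def __init__(self, docs, page_size):
        self._docs = list(docs)
        self._page_size = page_size
        self._ids = itertools.count(1)
        self._contexts = {}  # scroll id -> offset of the next page

    def start_scan(self):
        # Like a scan search, the first response carries a scroll id but no hits.
        scroll_id = f"scroll-{next(self._ids)}"
        self._contexts[scroll_id] = 0
        return {"scroll_id": scroll_id, "hits": []}

    def scroll(self, scroll_id):
        # Assumption of this sketch: an id is consumed on use and replaced by a new one.
        if scroll_id not in self._contexts:
            raise LookupError(f"No search context found for id [{scroll_id}]")
        offset = self._contexts.pop(scroll_id)
        hits = self._docs[offset:offset + self._page_size]
        new_id = f"scroll-{next(self._ids)}"
        self._contexts[new_id] = offset + len(hits)
        return {"scroll_id": new_id, "hits": hits}


def scroll_all(server):
    """Collect every hit, always continuing from the latest response's scroll id."""
    scroll_id = server.start_scan()["scroll_id"]
    collected = []
    while True:
        response = server.scroll(scroll_id)
        if not response["hits"]:
            break
        collected.extend(response["hits"])
        scroll_id = response["scroll_id"]  # refresh the id on every iteration
    return collected


def scroll_all_buggy(server):
    """Same loop, but it keeps reusing the id from the initial scan response."""
    first = server.start_scan()
    collected = []
    while True:
        response = server.scroll(first["scroll_id"])  # stale after the first call
        if not response["hits"]:
            break
        collected.extend(response["hits"])
    return collected
```

With the Java client discussed in this thread, the equivalent fix is the one Jörg names: pass `searchResponse.getScrollId()` from the most recent response into the next scroll request.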



Re: SearchContextMissingException when doing scan search [0.90.7-0.90.13]

2014-04-28 Thread Matteo Moci
Hi Jorg,
Investigating which call the exception relates to,
I ran the search on 2 types, and I'll show you the merged logs of my client
and the one printed on the node:

[2014-04-28 12:09:07,*438*][DEBUG][action.search.type   ] [moxbook-pro] [2] Failed to execute query phase
2014-04-28 12:09:07,*442*/CEST [main] INFO finished scrolling, hits length <=0
[2014-04-28 12:09:07,*455*][DEBUG][action.search.type   ] [moxbook-pro] [3] Failed to execute query phase
2014-04-28 12:09:07,*456*/CEST [main] INFO finished scrolling, hits length <=0

The client logs are printed by the client code (running on the same
machine), once per type, after exiting the while (hits.length > 0) scroll
loop and before calling clear scroll at the end.

So to wrap up, according to what I see, the error message
of the failing execute query phase happens before the clear scroll.

Following another path and reading your first email,
I added to the configuration file elasticsearch.yml the line:

search.keep_alive_interval: 1H
hoping to set the search keep alive explicitly on the node.

Starting the node like this:
$ ./elasticsearch -f -Des.logger.level=DEBUG
gives this log [1], but I can't find any mention of the keep-alive interval
I set up.

Is it the right parameter name, and should it help?

Thanks,
Matteo

[1] https://gist.github.com/mox601/11368114



On Mon, Apr 28, 2014 at 11:53 AM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:

> OK, the "new Scroll(...)" shouldn't make any difference.
>
> Maybe the cause of the error message is the clear scroll call at the end?
> If so, it shouldn't be serious.
>
> Jörg
>



-- 
Matteo Moci
http://mox.fm



Re: SearchContextMissingException when doing scan search [0.90.7-0.90.13]

2014-04-28 Thread Matteo Moci
Hi Jorg,
Thanks for the reply.
If that helps, I linked a snippet with the code at [1].
In addition, the test I did was on 1 index, 2 shards with 0 replicas,
see the status at [2] and settings at [3].

How can I provide you with more details?

Thanks,
Matteo

[1] https://gist.github.com/mox601/545c7176785ef209f7f3
[2] https://gist.github.com/mox601/fd806317dbe1d89ff12c
[3] https://gist.github.com/mox601/11366726



On Mon, Apr 28, 2014 at 10:56 AM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:

> The message is serious because it signals that your scan/scroll search has
> prematurely ended and responses may be dropped (or not).
>
> It would be nice to show us your code and your shard settings.
>
> search.keep_alive_interval is the parameter that can influence the search
> reaper, not search operation threading. The reaper runs each minute and
> might run too frequently. Without knowing your scan/scroll lifetime
> setting, I can't tell more.
>
> Jörg
>
>



-- 
Matteo Moci
http://mox.fm



SearchContextMissingException when doing scan search [0.90.7-0.90.13]

2014-04-28 Thread Matteo Moci
Hi All,
First of all, thanks for the nice work!

I found out that during a simple scan search (10 docs of 1 type, on 1 node,
1 index with 2 shards, no replicas, both v0.90.7 and v0.90.13),
I _always_ receive this exception in the node logs (and it happens once per
type when I have more):

[2014-04-28 10:22:07,937][DEBUG][action.search.type   ] [moxbook] [45] Failed to execute query phase
org.elasticsearch.search.SearchContextMissingException: No search context found for id [45]
    at org.elasticsearch.search.SearchService.findContext(SearchService.java:460)
    at org.elasticsearch.search.SearchService.executeScan(SearchService.java:211)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteScan(SearchServiceTransportAction.java:474)
    at org.elasticsearch.action.search.type.TransportSearchScrollScanAction$AsyncAction.executePhase(TransportSearchScrollScanAction.java:210)
    at org.elasticsearch.action.search.type.TransportSearchScrollScanAction$AsyncAction$2.run(TransportSearchScrollScanAction.java:180)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)

What worries me is that in my tests it apparently scans all the docs with
no missing records, so I don't know whether logs like this are "normal" or
whether in some cases they will lead to problems.

I already saw that there's a closed issue at [5165], but I don't know if
the fix was already included in the releases I tried, 0.90.7 (surely not)
and 0.90.13 (?).

Changing SearchOperationThreading doesn't help.

You can see the relevant snippet I used to get the exception at [1],
maybe there is something missing there?

Could someone help to find out if it's ok, or where the fix was introduced?
PS: I didn't upgrade to 1.x yet.

Best,
Matteo

[5165] https://github.com/elasticsearch/elasticsearch/issues/5165
[1] https://gist.github.com/mox601/545c7176785ef209f7f3

-- 
Matteo Moci
http://mox.fm
