Re: Single node job

2014-08-27 Thread joergpra...@gmail.com
With ScheduledThreadPoolExecutor from java.util.concurrent, you can set the
thread pool size to 1, which ensures serial execution. No need for Quartz.
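
A minimal sketch of the single-thread scheduler idea, not taken from any
plugin: the class name, the sleep that stands in for the purge, and the
timings are all made up for illustration. Two periodic tasks share one
worker thread, and the code records the highest number of tasks it ever saw
running at the same moment.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SerialPurgeScheduler {

    /** Lets two periodic dummy tasks run `runs` times in total; returns the peak concurrency seen. */
    static int maxConcurrency(int runs) {
        // Core pool size 1: every task handed to this executor shares one worker thread.
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);

        Runnable purge = () -> {
            maxSeen.accumulateAndGet(active.incrementAndGet(), Math::max);
            try {
                Thread.sleep(20); // placeholder for the real purge; slower than the 5 ms period
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                active.decrementAndGet();
                done.countDown();
            }
        };

        // Two independent schedules of the same work, both on the single worker.
        scheduler.scheduleAtFixedRate(purge, 0, 5, TimeUnit.MILLISECONDS);
        scheduler.scheduleAtFixedRate(purge, 0, 5, TimeUnit.MILLISECONDS);
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent runs: " + maxConcurrency(5));
    }
}
```

Note that this only serializes runs inside one JVM. It does nothing to stop
the same job from running on another node at the same time.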

You are correct: you need to orchestrate the plugin execution across all the
nodes where it is installed to prevent multiple distributed executions. A
variant of this is to execute the plugin only on the master node. When
implementing a custom action, you can define on which node the action is
executed, e.g. on the master only. See TransportMasterNodeOperationAction,
which is used e.g. for cluster state update operations that only make sense
when executed on the master node. Such a custom action can be triggered
internally by a ScheduledThreadPoolExecutor.
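
To illustrate the "master only" idea without writing a full transport
action, here is a simplified sketch. It is not the
TransportMasterNodeOperationAction route described above: instead, every
node schedules the job and a wrapper skips the run unless the node is
currently the master. The isLocalNodeMaster supplier is a hypothetical hook;
in a real plugin it would have to be backed by the cluster state, and the
exact call for that is not shown here.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class MasterOnlyTask implements Runnable {

    // Hypothetical hook: answers "is this node the elected master right now?"
    private final BooleanSupplier isLocalNodeMaster;
    private final Runnable job;

    MasterOnlyTask(BooleanSupplier isLocalNodeMaster, Runnable job) {
        this.isLocalNodeMaster = isLocalNodeMaster;
        this.job = job;
    }

    @Override
    public void run() {
        // Checked on every tick, so a newly elected master starts purging
        // on its next scheduled run and a demoted one stops.
        if (isLocalNodeMaster.getAsBoolean()) {
            job.run();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger purges = new AtomicInteger();
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        // For the demo this node always claims to be the master.
        Runnable task = new MasterOnlyTask(() -> true, () -> purges.incrementAndGet());
        scheduler.scheduleWithFixedDelay(task, 0, 10, TimeUnit.MILLISECONDS);
        Thread.sleep(100);
        scheduler.shutdownNow();
        System.out.println("purge runs on master: " + purges.get());
    }
}
```

The check is only as good as the node's view of who is master, so during a
master change two nodes could briefly both run the job; the purge should be
safe to run twice.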

Just for the record:

Not sure why you rule out cron jobs. In my opinion, this is by far the best
method you can choose. A plugin is too clumsy for simple purging tasks
(unless you have an easy method for dynamic config / update of a plugin
across ES versions). With an external script wrapped in a cron job, you are
free to start/stop purging, config/update is much more flexible, and there
is no need for orchestration or master node selection. It can also be
maintained by non-Java developers/operators.

To avoid parallel execution, you could simply use flock from util-linux:

* * * * * /usr/bin/flock -n /var/tmp/mydocpurge.lock /usr/local/bin/mydocpurge

In most cases, mydocpurge would consist of two curl calls: one to search
for the doc ids, whose output is processed with jq into a JSON array of doc
ids, and a second one to delete those docs. Plus, you can send an email to
the admin.
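
The same two-step shape (collect ids, then delete by id) also applies if the
purge is written in Java rather than as a shell script. As a rough sketch of
the middle step only, the code below turns a list of ids into a JSON body
containing an ids query. The class and method names are made up, the ids
"a1" and "b2" are dummy values, and how the body is sent to ES is left out
on purpose.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PurgeRequestBody {

    /** Wraps a string in JSON quotes; assumes ids contain no control characters. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds a query body that matches exactly the given doc ids. */
    static String idsQuery(List<String> ids) {
        String values = ids.stream()
                .map(PurgeRequestBody::quote)
                .collect(Collectors.joining(","));
        return "{\"query\":{\"ids\":{\"values\":[" + values + "]}}}";
    }

    public static void main(String[] args) {
        // prints {"query":{"ids":{"values":["a1","b2"]}}}
        System.out.println(idsQuery(Arrays.asList("a1", "b2")));
    }
}
```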

My 2 cents.

Jörg


On Wed, Aug 27, 2014 at 12:00 PM, Pawel  wrote:

> Hi,
> I have to prepare a mechanism which is able to run a scheduled job in
> ES. This job will be responsible for the periodic removal of unused
> documents. I think it is easy to achieve in a simple plugin. The only
> problem I see is synchronization. I think it is reasonable that only one
> job is running at a time. Is there any way to do it with the internal ES
> API (without using external software like ZooKeeper)?
>
> To answer possible suggestions: I don't want to, and cannot, use any
> external tools or run it as a cron job.
>
> Any ideas?
>
> --
> Paweł Róg

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoEcEVbgKnQLaUK3HjyhWhJO6J7XZv6hW_OY1rwO0ADkkw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Single node job

2014-08-27 Thread Pawel
Hi,
Thank you for your response, but I think that's not the case.
TTLs are not what I want to use. There is some logic which is used to
decide which documents are no longer active and should be removed (TTL
won't work here).

Quartz is only a scheduler and I don't think it is able to synchronize
distributed nodes (at least I don't think Quartz can do this). What I need
is something like "leader election", where only the leader triggers
"delete" jobs. When the leader dies, another node takes over its role. This
is the reason why I mentioned ZooKeeper, but I don't want to use it: it is
an external piece of software / moving part from the ES point of view. ES
also elects a leader, so I thought I could use ES internal mechanisms.

--
Paweł


On Wed, Aug 27, 2014 at 12:13 PM, Mark Walkom 
wrote:

> It depends on what exactly you are doing, but there are document TTLs
> that might suit, although they are resource intensive.
>
> A plugin could work, as you could leverage the Quartz scheduler libraries
> to handle running it.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>



Re: Single node job

2014-08-27 Thread Mark Walkom
It depends on what exactly you are doing, but there are document TTLs that
might suit, although they are resource intensive.

A plugin could work, as you could leverage the Quartz scheduler libraries to
handle running it.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 27 August 2014 20:00, Pawel  wrote:

> Hi,
> I have to prepare a mechanism which is able to run a scheduled job in
> ES. This job will be responsible for the periodic removal of unused
> documents. I think it is easy to achieve in a simple plugin. The only
> problem I see is synchronization. I think it is reasonable that only one
> job is running at a time. Is there any way to do it with the internal ES
> API (without using external software like ZooKeeper)?
>
> To answer possible suggestions: I don't want to, and cannot, use any
> external tools or run it as a cron job.
>
> Any ideas?
>
> --
> Paweł Róg



Single node job

2014-08-27 Thread Pawel
Hi,
I have to prepare a mechanism which is able to run a scheduled job in ES.
This job will be responsible for the periodic removal of unused documents.
I think it is easy to achieve in a simple plugin. The only problem I see is
synchronization. I think it is reasonable that only one job is running at a
time. Is there any way to do it with the internal ES API (without using
external software like ZooKeeper)?

To answer possible suggestions: I don't want to, and cannot, use any
external tools or run it as a cron job.

Any ideas?

--
Paweł Róg
