Index creation on Plugin instantiation

2015-05-26 Thread Thomas
Hi,

I'm experimenting with Elasticsearch plugin creation and I'm trying to 
create an index (if missing) on plugin startup.

I wanted to ask: what is the best place to add the code for index 
creation? I have added it in an injected binding with Client as a 
constructor parameter, but I get the following error:

no known master node, scheduling a retry
> [2015-05-26 12:03:27,289][ERROR][bootstrap] {1.4.1}: 
> Initialization Failed ...
> 1) UncategorizedExecutionException[Failed execution]
> ExecutionException[java.lang.NullPointerException]
> NullPointerException
>

My guess is that the Client is not yet ready to handle index creation 
requests. My code snippet is the following:

public class IndexCreator {

    private final String indexName;
    private final ESLogger LOG;

    @Inject
    public IndexCreator(Settings settings, Client client) {
        this.LOG = Loggers.getLogger(getClass(), settings);
        this.indexName = settings.get("metis.index.name", ".metis");

        IndicesExistsResponse resp =
                client.admin().indices().prepareExists(indexName).get();

        if (!resp.isExists()) {
            client.admin().indices().prepareCreate(indexName).get();
        }
    }
}

And I add this as a binding in my module:

public class MyModule extends AbstractModule {

    private final Settings settings;

    public MyModule(Settings settings) {
        this.settings = Preconditions.checkNotNull(settings);
    }

    @Override
    protected void configure() {
        bind(IndexCreator.class).asEagerSingleton();
    }
}


But it produces the aforementioned error. Any ideas?
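One workaround I am considering (an untested sketch against the 1.x Java API; the ClusterStateListener wiring and the recovered-state check are my assumptions, only the class and setting names come from my plugin) is to not touch the Client in the constructor at all, and instead defer the index creation until the cluster has an elected master and has recovered its state:

```java
public class IndexCreator extends AbstractComponent implements ClusterStateListener {

    private final Client client;
    private final String indexName;

    @Inject
    public IndexCreator(Settings settings, Client client, ClusterService clusterService) {
        super(settings);
        this.client = client;
        this.indexName = settings.get("metis.index.name", ".metis");
        // Don't use the client here -- just ask to be called back on cluster state changes.
        clusterService.add(this);
    }

    @Override
    public void clusterChanged(ClusterChangedEvent event) {
        ClusterState state = event.state();
        // Wait until a master is elected and the gateway has recovered the cluster state.
        if (state.nodes().masterNode() == null
                || state.blocks().hasGlobalBlock(GatewayService.STATE_NOT_RECOVERED_BLOCK)) {
            return;
        }
        if (!state.metaData().hasIndex(indexName)) {
            client.admin().indices().prepareCreate(indexName).execute();
        }
    }
}
```

That way nothing runs before the node is ready, which would explain the "no known master node" retries I'm seeing.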

Thanks,

Thomas

-- 
Please update your bookmarks! We have moved to https://discuss.elastic.co/
--- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e7616e28-b6aa-45b0-989f-5ee7d55c02ca%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Kibana 4 - ability to select a date range on dashboard that is reflected in other visualizations

2015-04-14 Thread Thomas Bratt
Hi,

I am using Kibana 4 with a Date Histogram. I can select a time range with 
the mouse, but the other visualizations on the dashboard do not seem to 
update. I only have data from today, which might be affecting things.

Would appreciate it if someone could tell me how to get this to work :)

Thomas



Re: Kibana 4 - ability to see source data from Dashboard

2015-04-14 Thread Thomas Bratt
A colleague just pointed out that you can add a search to the dashboard. 
Seems to work :)

On Tuesday, 14 April 2015 14:57:43 UTC+1, Thomas Bratt wrote:
>
> Hi,
>
> I can't seem to get access to the original data by drilling down on the 
> visualizations on the dashboard. Am I missing something?
>
> Many thanks,
>
> Thomas
>



Kibana 4 - ability to see source data from Dashboard

2015-04-14 Thread Thomas Bratt
Hi,

I can't seem to get access to the original data by drilling down on the 
visualizations on the dashboard. Am I missing something?

Many thanks,

Thomas



Re: Kibana: Mark warnings as "solved"

2015-04-08 Thread Thomas Güttler
I know how to use a programming language and I could start my own project.

But I would like to avoid that, since it leads to "plumbing". I guess other 
people have the same use case,
and I would like to use (and improve) an existing project.

But I have not found any so far.

How do other ELK users solve my use case?

I guess I am missing something.

Regards,
  Thomas Güttler


On Wednesday, 8 April 2015 at 11:02:35 UTC+2, James Green wrote:
>
> Couldn't you update the document with a flag on a field?
>
> On 8 April 2015 at 09:43, Thomas Güttler wrote:
>
>> We are evaluating if ELK is the right tool for our logs and event 
>> messages.
>>
>> We need a way to mark warnings as "done". All warnings of this type 
>> should be invisible in the future.
>>
>> Use case:
>>
>> There was a bug in our code and the dev team has created a fix. 
>> Continuous Integration is running,
>> and soon the bug in the production system will be gone.
>>
>> We need a way to mark the warnings as "this type of warning is already 
>> handled, and the 
>> fix will be in the production system during the next three hours".
>>
>> Can you understand what I want?
>>
>> How to handle this with ELK?
>>
>> Just removing these logs from ElasticSearch is not a solution, since 
>> during the next hours (after
>> setting the flag "done") new events can still come into the system.
>>
>>



Kibana: Mark warnings as "solved"

2015-04-08 Thread Thomas Güttler
We are evaluating if ELK is the right tool for our logs and event messages.

We need a way to mark warnings as "done". All warnings of this type should 
be invisible in the future.

Use case:

There was a bug in our code and the dev team has created a fix. Continuous 
Integration is running,
and soon the bug in the production system will be gone.

We need a way to mark the warnings as "this type of warning is already 
handled, and the 
fix will be in the production system during the next three hours".

Can you understand what I want?

How to handle this with ELK?

Just removing these logs from Elasticsearch is not a solution, since during 
the next few hours (after
setting the "done" flag) new events can still come into the system.
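One approach that might work (a sketch only; the index, type, document id, and the "resolved_until" field are invented for illustration): set an expiry timestamp on handled warnings with the update API, and filter flagged documents out of queries until the flag expires:

```
# mark a warning document as handled until a given time (hypothetical names)
POST /logs/warning/1/_update
{ "doc": { "resolved_until": "2015-04-08T15:00:00" } }

# show only warnings that are unhandled or whose flag has expired
POST /logs/warning/_search
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "should": [
            { "missing": { "field": "resolved_until" } },
            { "range": { "resolved_until": { "lt": "now" } } }
          ]
        }
      }
    }
  }
}
```

Newly arriving events of the same type would still need the flag applied to them, for example at ingest time.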




ELK for logfiles

2015-03-27 Thread Thomas Güttler
Hi,

I am planning to use ELK for our log files.

I read docs about logstash, elasticsearch and kibana.

Still, the whole picture is not yet solid for me.

Especially the reporting area is something I don't understand yet.

Kibana seems to be a great tool to do the visualization. 

But can I get at the individual log entries to debug the root cause of problems?

Example: I see that 99 systems work fine, and 1 system emits warnings.

Which interface could I use to see this system's logs in 
Elasticsearch?

Needed features:

Show all logs from system "foo" in the period between 2015-03-27 00:00 and 
00:10 (ten minutes).

Show all logs with log level "error" from system "foo" on 2015-03-27.

Is Kibana the right tool for this?

Or am I on the wrong track?

Which tool could be used to analyze log data in Elasticsearch?
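For reference, both lookups map to a simple filtered query; a sketch of the first one as a direct Elasticsearch request (the `host` and `@timestamp` field names are the usual Logstash defaults and are assumptions here):

```
POST /logstash-2015.03.27/_search
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "host": "foo" } },
            { "range": { "@timestamp": {
                "gte": "2015-03-27T00:00:00",
                "lt":  "2015-03-27T00:10:00" } } }
          ]
        }
      }
    }
  }
}
```

Adding something like { "term": { "level": "error" } } to the "must" list would cover the second lookup.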



Using ELK to analyze log warnings and exceptions - and mark them as "solved"

2015-03-13 Thread Thomas Güttler
We run several servers running our code.

Of course there are bugs which cause exceptions and warnings since 
something unusual occurs.

I want to analyze our logs to find unhandled warnings.

I am unsure if ELK can help us.

There needs to be some way to aggregate warnings into a warning of type X (to 
remove duplicates).

If a warning was handled and solved, we need a way to mark the warnings of 
type X as solved.

The flag should only be set for a limited period of time (for example 48 
hours). During this
time the new code should be deployed and the error should not occur again.

If it still occurs after N hours, the warning should be visible again.

Can you understand what I want?

Can this be done with ELK, or am I on the wrong track?
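The deduplication part can be sketched with a terms aggregation (the `warning_type` and `level` fields are invented; something like them would have to be extracted from the raw message, e.g. by Logstash):

```
POST /logstash-*/_search
{
  "size": 0,
  "query": { "filtered": { "filter": { "term": { "level": "warning" } } } },
  "aggs": {
    "by_type": { "terms": { "field": "warning_type" } }
  }
}
```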

Regards,
  Thomas Güttler



Easy ELK Stack setup

2015-02-25 Thread Thomas Güttler
Hi,

I want to setup an ELK stack without wasting time. That's why I ask here 
before starting.

My environment is simple: all traffic comes in from localhost. There is 
only one server for the ELK setup.

There will be several ELK stacks running in the future, but again, all 
traffic will come in only from localhost.
The systems will run isolated.

I see these solutions:

  - take a docker container

  - do it by hand (RPM install)

  - use Chef/Puppet. But up to now we don't use any of those tools.

  - any other idea?

What do you think?


Regards,
  Thomas Güttler



Re: Leaving out information from the response

2015-02-25 Thread Thomas Matthijs
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-fields.html
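In short, you can list the fields you want in the search request body; a minimal sketch of such a request (the index name is made up):

```
POST /myindex/_search
{
  "fields": ["Name", "Location", "Description"],
  "query": { "match_all": {} }
}
```

Source filtering (a "_source" list of includes in the request body) is an alternative if the fields are not stored.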

On Wed, Feb 25, 2015 at 9:13 AM, James  wrote:

> Hi,
>
> I want to have certain data in my elasticsearch index, but I don't want it
> to be returned with a query. At the moment it seems to return every bit of
> data I have for each index, and then I use my PHP application to hide it. Is
> it possible to select which fields elasticsearch returns in its response to
> my PHP application?
>
> For example, for each item:
>
> Name
> Location
> Description
> Keywords
> Unique ID
> Create date
>
> I just want to have in the response from elasticsearch:
>
> Name
> Location
> Description
>
>



Re: Kibana 4 behind reverse proxy. Is it possible?

2015-02-06 Thread Cijo Thomas
I hit the same issue when accessing the site using the DNS name. When I am on
the machine, http:localhost:/ works though. I have not figured out the
fix for this yet. It seems like a Kibana 4 CORS issue.

On Thu, Jan 29, 2015 at 3:38 PM, Konstantin Erman  wrote:

> Yes, Kibana 4 beta 3. And I have just one URL rewrite rule (pictured).
> Were you getting the same error when it was not working for you?
>
>
> <https://lh3.googleusercontent.com/-oDiu_ncjJlA/VMrEJL-Qj_I/Aic/so2IvrgTQbY/s1600/RewriteRule.png>
>
>
> On Thursday, January 29, 2015 at 3:31:56 PM UTC-8, Cijo Thomas wrote:
>>
>> Can you show your URL rewrite rules ? Also  are you using Kibana 4 beta 3
>> ?
>>
>> On Thu, Jan 29, 2015 at 1:09 PM, Konstantin Erman 
>> wrote:
>>
>>> Unfortunately I could not replicate your success :-(
>>>
>>> Let me show you what I did in case you notice any difference from
>>> your case.
>>>
>>>
>>> <https://lh6.googleusercontent.com/-HzQRKhGl9ag/VMqfkWnSF8I/Ah0/SsXrJlQ2vW8/s1600/Output_Caching.png>
>>>
>>>
>>> <https://lh6.googleusercontent.com/-V2VTx-iT888/VMqf0K7jChI/Ah8/qC7umA0XP_U/s1600/AppPool1.png>
>>>
>>>
>>> <https://lh6.googleusercontent.com/-4jL3Hyoq0QY/VMqgF7d0-II/AiE/77VOeAZP2e0/s1600/AppPool2.png>
>>>
>>>
>>> <https://lh5.googleusercontent.com/-aBFCh_BZKn4/VMqgnM9ejhI/AiM/zxnsdD-VK8U/s1600/Error.png>
>>> Any ideas what I may be missing?
>>>
>>> Thanks!
>>> Konstantin
>>>
>>> On Thursday, January 29, 2015 at 10:13:40 AM UTC-8, Cijo Thomas wrote:
>>>>
>>>> I have been fighting with this for quite some time, Finally found the
>>>> workaround. Let me know if it helps you!
>>>>
>>>> On Thu, Jan 29, 2015 at 10:12 AM, Konstantin Erman 
>>>> wrote:
>>>>
>>>>> Thank you for the good news! I'm a little swamped currently, but I
>>>>> will definitely give it a try when I get a minute.
>>>>>
>>>>> Just to make sure - "disable Output cache for the website" - where is
>>>>> it in IIS Management Console?
>>>>>
>>>>>
>>>>> On Wednesday, January 28, 2015 at 4:38:01 PM UTC-8, Cijo Thomas wrote:
>>>>>>
>>>>>> Its possible to use IIS with the following steps.
>>>>>> 1) Disable Output cache for the website you are using as reverse
>>>>>> proxy.
>>>>>> 2) Run the website in a new apppool, which do not have any managed
>>>>>> code.
>>>>>>
>>>>>> With the above two steps, kibana4 runs fine with IIS as reverse proxy.
>>>>>>
>>>>>>
>>>>>> On Saturday, December 27, 2014 at 4:19:31 PM UTC-8, Konstantin Erman
>>>>>> wrote:
>>>>>>>
>>>>>>> We currently use Kibana 3 hosted in IIS behind IIS reverse proxy for
>>>>>>> auhentication. Naturally we look at Kibana 4 Beta 3 expecting it to 
>>>>>>> replace
>>>>>>> Kibana 3 soon. Kibana 4 is self hosted and works nicely when accessed
>>>>>>> directly, but we need authentication and whatever I do I cannot make it
>>>>>>> work from behind reverse proxy! Early or later I get 401 accessing some
>>>>>>> internal resource.
>>>>>>>
>>>>>>> Wonder if anybody hit similar problem and have any insight how to
>>>>>>> make it work.
>>>>>>> We cannot use Shield as its price is way beyond our bounds.
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Konstantin
>>>>>>

Re: Kibana 4 behind reverse proxy. Is it possible?

2015-01-29 Thread Cijo Thomas
Can you show your URL rewrite rules ? Also  are you using Kibana 4 beta 3 ?

On Thu, Jan 29, 2015 at 1:09 PM, Konstantin Erman  wrote:

> Unfortunately I could not replicate your success :-(
>
> Let me show you what I did in case you notice any difference from
> your case.
>
>
> <https://lh6.googleusercontent.com/-HzQRKhGl9ag/VMqfkWnSF8I/Ah0/SsXrJlQ2vW8/s1600/Output_Caching.png>
>
>
> <https://lh6.googleusercontent.com/-V2VTx-iT888/VMqf0K7jChI/Ah8/qC7umA0XP_U/s1600/AppPool1.png>
>
>
> <https://lh6.googleusercontent.com/-4jL3Hyoq0QY/VMqgF7d0-II/AiE/77VOeAZP2e0/s1600/AppPool2.png>
>
>
> <https://lh5.googleusercontent.com/-aBFCh_BZKn4/VMqgnM9ejhI/AiM/zxnsdD-VK8U/s1600/Error.png>
> Any ideas what I may be missing?
>
> Thanks!
> Konstantin
>
> On Thursday, January 29, 2015 at 10:13:40 AM UTC-8, Cijo Thomas wrote:
>>
>> I have been fighting with this for quite some time, Finally found the
>> workaround. Let me know if it helps you!
>>
>> On Thu, Jan 29, 2015 at 10:12 AM, Konstantin Erman 
>> wrote:
>>
>>> Thank you for the good news! I'm a little swamped currently, but I will
>>> definitely give it a try when I get a minute.
>>>
>>> Just to make sure - "disable Output cache for the website" - where is it
>>> in IIS Management Console?
>>>
>>>
>>> On Wednesday, January 28, 2015 at 4:38:01 PM UTC-8, Cijo Thomas wrote:
>>>>
>>>> Its possible to use IIS with the following steps.
>>>> 1) Disable Output cache for the website you are using as reverse proxy.
>>>> 2) Run the website in a new apppool, which do not have any managed code.
>>>>
>>>> With the above two steps, kibana4 runs fine with IIS as reverse proxy.
>>>>
>>>>
>>>> On Saturday, December 27, 2014 at 4:19:31 PM UTC-8, Konstantin Erman
>>>> wrote:
>>>>>
>>>>> We currently use Kibana 3 hosted in IIS behind IIS reverse proxy for
>>>>> auhentication. Naturally we look at Kibana 4 Beta 3 expecting it to 
>>>>> replace
>>>>> Kibana 3 soon. Kibana 4 is self hosted and works nicely when accessed
>>>>> directly, but we need authentication and whatever I do I cannot make it
>>>>> work from behind reverse proxy! Early or later I get 401 accessing some
>>>>> internal resource.
>>>>>
>>>>> Wonder if anybody hit similar problem and have any insight how to make
>>>>> it work.
>>>>> We cannot use Shield as its price is way beyond our bounds.
>>>>>
>>>>> Thanks!
>>>>> Konstantin
>>>>
>>
>>
>>
>> --
>> Warm Regards,
>> Cijo Thomas
>> +1 3125606441



-- 
Warm Regards,
Cijo Thomas
+1 3125606441



Re: Kibana 4 behind reverse proxy. Is it possible?

2015-01-29 Thread Cijo Thomas
I have been fighting with this for quite some time, Finally found the
workaround. Let me know if it helps you!

On Thu, Jan 29, 2015 at 10:12 AM, Konstantin Erman  wrote:

> Thank you for the good news! I'm a little swamped currently, but I will
> definitely give it a try when I get a minute.
>
> Just to make sure - "disable Output cache for the website" - where is it
> in IIS Management Console?
>
>
> On Wednesday, January 28, 2015 at 4:38:01 PM UTC-8, Cijo Thomas wrote:
>>
>> Its possible to use IIS with the following steps.
>> 1) Disable Output cache for the website you are using as reverse proxy.
>> 2) Run the website in a new apppool, which do not have any managed code.
>>
>> With the above two steps, kibana4 runs fine with IIS as reverse proxy.
>>
>>
>> On Saturday, December 27, 2014 at 4:19:31 PM UTC-8, Konstantin Erman
>> wrote:
>>>
>>> We currently use Kibana 3 hosted in IIS behind IIS reverse proxy for
>>> auhentication. Naturally we look at Kibana 4 Beta 3 expecting it to replace
>>> Kibana 3 soon. Kibana 4 is self hosted and works nicely when accessed
>>> directly, but we need authentication and whatever I do I cannot make it
>>> work from behind reverse proxy! Early or later I get 401 accessing some
>>> internal resource.
>>>
>>> Wonder if anybody hit similar problem and have any insight how to make
>>> it work.
>>> We cannot use Shield as its price is way beyond our bounds.
>>>
>>> Thanks!
>>> Konstantin
>>



-- 
Warm Regards,
Cijo Thomas
+1 3125606441



Re: Kibana 4 behind reverse proxy. Is it possible?

2015-01-28 Thread Cijo Thomas
It's possible to use IIS with the following steps:
1) Disable the output cache for the website you are using as the reverse proxy.
2) Run the website in a new app pool which does not have any managed code.

With the above two steps, Kibana 4 runs fine with IIS as a reverse proxy.
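For reference, the proxying itself is just a URL Rewrite rule; a minimal web.config sketch (assuming the ARR and URL Rewrite modules are installed and Kibana listens on localhost:5601 — both assumptions):

```
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <rule name="kibana-proxy" stopProcessing="true">
          <match url="(.*)" />
          <action type="Rewrite" url="http://localhost:5601/{R:1}" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
```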


On Saturday, December 27, 2014 at 4:19:31 PM UTC-8, Konstantin Erman wrote:
>
> We currently use Kibana 3 hosted in IIS behind an IIS reverse proxy for 
> authentication. Naturally we look at Kibana 4 Beta 3 expecting it to replace 
> Kibana 3 soon. Kibana 4 is self-hosted and works nicely when accessed 
> directly, but we need authentication, and whatever I do I cannot make it 
> work from behind the reverse proxy! Sooner or later I get a 401 accessing some 
> internal resource. 
>
> Wonder if anybody hit similar problem and have any insight how to make it 
> work. 
> We cannot use Shield as its price is way beyond our bounds. 
>
> Thanks! 
> Konstantin



Out of memory on start with 38GB index

2015-01-15 Thread Thomas Cataldo
tes(BufferedUpdatesStream.java:287)

at 
org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3271)

at 
org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3262)

at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:421)

at 
org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:292)

at 
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:267)

at 
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:257)

at 
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:171)

at 
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:118)

at 
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)

at 
org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)

at 
org.apache.lucene.search.ReferenceManager.maybeRefresh(ReferenceManager.java:225)

at 
org.elasticsearch.index.engine.internal.InternalEngine.refresh(InternalEngine.java:796)

at 
org.elasticsearch.index.engine.internal.InternalEngine.delete(InternalEngine.java:692)

at 
org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:798)

at 
org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:268)

at 
org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)

at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)

[2015-01-14 12:01:32,238][DEBUG][index.service] [Saint Elmo] 
[mailspool] [1] closing... (reason: [engine failure, message [refresh 
failed][OutOfMemoryError[Java heap space]]])

[2015-01-14 12:01:32,238][DEBUG][index.shard.service  ] [Saint Elmo] 
[mailspool][1] state: [RECOVERING]->[CLOSED], reason [engine failure, 
message [refresh failed][OutOfMemoryError[Java heap space]]]

[2015-01-14 12:01:32,315][DEBUG][index.service] [Saint Elmo] 
[mailspool] [1] closed (reason: [engine failure, message [refresh 
failed][OutOfMemoryError[Java heap space]]])


I tried adding a few settings to my elasticsearch.yml as suggested in the 
referenced issue:

index.load_fixed_bitset_filters_eagerly: false

index.warmer.enabled: false
indices.breaker.total.limit: 30% 

But none of these settings seems to work for me.

Our mapping is visible here:
http://git.blue-mind.net/gitlist/bluemind/blob/master/esearch/config/templates/mailspool.json

It is used to store a full-text index of emails. It uses a parent/child 
structure:
the msgBody type contains the full text of the messages and attachments, and 
the msg type contains user flags (unread, important, the folder it is stored 
in, etc.).

We use this structure because "msg" is often updated: mails are often marked as 
read or moved. The msgBody can be pretty big, so we don't want to update the 
whole document when a simple email flag is changed.

Does this kind of index structure remind you of a particular bug or required 
setting? Any rule of thumb for sizing memory relative to index size on disk?

Regards,
Thomas.






Re: out of memory at startup with large index and parent/child relation

2015-01-14 Thread Thomas Cataldo
Hi,

By removing all my translog files, ES can start without error.


On Wednesday, January 14, 2015 at 2:56:48 PM UTC+1, Thomas Cataldo wrote:
>
> Hi,
>
> I encounter a problem with a large index (38GB) that prevents ES 1.4.2 
> from starting.
> The problem looks pretty similar to the one in 
> https://github.com/elasticsearch/elasticsearch/issues/8394
>
> I tried some of the recommendations from this post (and linked ones):
>
> index.load_fixed_bitset_filters_eagerly: false
> index.warmer.enabled: false
> indices.breaker.total.limit: 30%
>
> And even with that, my server does not start [1].
>
> I uploaded to gist the mapping for the index : 
> https://gist.github.com/tcataldo/c0b6b3dfec9823bf6523
>
> I tried several OS memory, ES heap combinations, the biggest being
> 48GiB for the operating system and 32GiB for ES heap and it still
> fails with that.
>
> Any idea or link to an open issue I could follow ?
>
> Regards,
> Thomas.
>
>
>
> 1. debug output:
>
> [2015-01-14 12:01:55,740][DEBUG][indices.cluster  ] [Saint Elmo] 
> [mailspool][0] creating shard
> [2015-01-14 12:01:55,741][DEBUG][index.service] [Saint Elmo] 
> [mailspool] creating shard_id [0]
> [2015-01-14 12:01:56,041][DEBUG][index.deletionpolicy ] [Saint Elmo] 
> [mailspool][0] Using [keep_only_last] deletion policy
> [2015-01-14 12:01:56,041][DEBUG][index.merge.policy   ] [Saint Elmo] 
> [mailspool][0] using [tiered] merge mergePolicy with 
> expunge_deletes_allowed[10.0], floor_segment[2mb], max_merge_at_on\
> ce[10], max_merge_at_once_explicit[30], max_merged_segment[5gb], 
> segments_per_tier[10.0], reclaim_deletes_weight[2.0]
> [2015-01-14 12:01:56,041][DEBUG][index.merge.scheduler] [Saint Elmo] 
> [mailspool][0] using [concurrent] merge scheduler with max_thread_count[2], 
> max_merge_count[4]
> [2015-01-14 12:01:56,042][DEBUG][index.shard.service  ] [Saint Elmo] 
> [mailspool][0] state: [CREATED]
> [2015-01-14 12:01:56,043][DEBUG][index.translog   ] [Saint Elmo] 
> [mailspool][0] interval [5s], flush_threshold_ops [2147483647], 
> flush_threshold_size [200mb], flush_threshold_period [30m]
> [2015-01-14 12:01:56,044][DEBUG][index.shard.service  ] [Saint Elmo] 
> [mailspool][0] state: [CREATED]->[RECOVERING], reason [from gateway]
> [2015-01-14 12:01:56,044][DEBUG][index.gateway] [Saint Elmo] 
> [mailspool][0] starting recovery from local ...
> [2015-01-14 12:01:56,048][DEBUG][river.cluster] [Saint Elmo] 
> processing [reroute_rivers_node_changed]: execute
> [2015-01-14 12:01:56,048][DEBUG][river.cluster] [Saint Elmo] 
> processing [reroute_rivers_node_changed]: no change in cluster_state
> [2015-01-14 12:01:56,048][DEBUG][cluster.service  ] [Saint Elmo] 
> processing [shard-failed ([mailspool][3], node[gOgAuHo4SXyfyuPpws0Usw], 
> [P], s[INITIALIZING]), reason [engine failure, message [refresh 
> failed][OutOfMemoryError[Java heap space: done 
> applying updated cluster_state (version: 4)
> [2015-01-14 12:01:56,062][DEBUG][index.engine.internal] [Saint Elmo] 
> [mailspool][0] starting engine
> [2015-01-14 12:02:19,701][WARN ][index.engine.internal] [Saint Elmo] 
> [mailspool][0] failed engine [refresh failed]
> java.lang.OutOfMemoryError: Java heap space
> at org.apache.lucene.util.FixedBitSet.<init>(FixedBitSet.java:187)
> at 
> org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:104)
> at 
> org.elasticsearch.index.cache.filter.weighted.WeightedFilterCache$FilterCacheFilterWrapper.getDocIdSet(WeightedFilterCache.java:177)
> at 
> org.elasticsearch.common.lucene.search.OrFilter.getDocIdSet(OrFilter.java:55)
> at 
> org.elasticsearch.common.lucene.search.ApplyAcceptedDocsFilter.getDocIdSet(ApplyAcceptedDocsFilter.java:46)
> at 
> org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:130)
> at 
> org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:542)
> at 
> org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:136)
> at 
> org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)
> at 
> org.apache.lucene.index.BufferedUpdatesStream.applyQueryDeletes(BufferedUpdatesStream.java:554)
> at 
> org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:287)
> at 
> org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3271)
> at 
> org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3262)
> at 
> org.apache.luce

out of memory at startup with large index and parent/child relation

2015-01-14 Thread Thomas Cataldo
Hi,

I encounter a problem with a large index (38GB) that prevents ES 1.4.2 from 
starting.
The problem looks pretty similar to the one in 
https://github.com/elasticsearch/elasticsearch/issues/8394

I tried some of the recommendations from this post (and the linked ones):

index.load_fixed_bitset_filters_eagerly: false
index.warmer.enabled: false
indices.breaker.total.limit: 30%

And even with that, my server does not start [1].

I uploaded the mapping for the index to a gist: 
https://gist.github.com/tcataldo/c0b6b3dfec9823bf6523

I tried several OS memory / ES heap combinations, the biggest being
48GiB for the operating system and 32GiB for the ES heap, and it still
fails.

Any idea, or a link to an open issue I could follow?

Regards,
Thomas.



1. debug output:

[2015-01-14 12:01:55,740][DEBUG][indices.cluster  ] [Saint Elmo] 
[mailspool][0] creating shard
[2015-01-14 12:01:55,741][DEBUG][index.service] [Saint Elmo] 
[mailspool] creating shard_id [0]
[2015-01-14 12:01:56,041][DEBUG][index.deletionpolicy ] [Saint Elmo] 
[mailspool][0] Using [keep_only_last] deletion policy
[2015-01-14 12:01:56,041][DEBUG][index.merge.policy   ] [Saint Elmo] 
[mailspool][0] using [tiered] merge mergePolicy with 
expunge_deletes_allowed[10.0], floor_segment[2mb], max_merge_at_once[10], 
max_merge_at_once_explicit[30], max_merged_segment[5gb], 
segments_per_tier[10.0], reclaim_deletes_weight[2.0]
[2015-01-14 12:01:56,041][DEBUG][index.merge.scheduler] [Saint Elmo] 
[mailspool][0] using [concurrent] merge scheduler with max_thread_count[2], 
max_merge_count[4]
[2015-01-14 12:01:56,042][DEBUG][index.shard.service  ] [Saint Elmo] 
[mailspool][0] state: [CREATED]
[2015-01-14 12:01:56,043][DEBUG][index.translog   ] [Saint Elmo] 
[mailspool][0] interval [5s], flush_threshold_ops [2147483647], 
flush_threshold_size [200mb], flush_threshold_period [30m]
[2015-01-14 12:01:56,044][DEBUG][index.shard.service  ] [Saint Elmo] 
[mailspool][0] state: [CREATED]->[RECOVERING], reason [from gateway]
[2015-01-14 12:01:56,044][DEBUG][index.gateway] [Saint Elmo] 
[mailspool][0] starting recovery from local ...
[2015-01-14 12:01:56,048][DEBUG][river.cluster] [Saint Elmo] 
processing [reroute_rivers_node_changed]: execute
[2015-01-14 12:01:56,048][DEBUG][river.cluster] [Saint Elmo] 
processing [reroute_rivers_node_changed]: no change in cluster_state
[2015-01-14 12:01:56,048][DEBUG][cluster.service  ] [Saint Elmo] 
processing [shard-failed ([mailspool][3], node[gOgAuHo4SXyfyuPpws0Usw], 
[P], s[INITIALIZING]), reason [engine failure, message [refresh 
failed][OutOfMemoryError[Java heap space: done 
applying updated cluster_state (version: 4)
[2015-01-14 12:01:56,062][DEBUG][index.engine.internal] [Saint Elmo] 
[mailspool][0] starting engine
[2015-01-14 12:02:19,701][WARN ][index.engine.internal] [Saint Elmo] 
[mailspool][0] failed engine [refresh failed]
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.FixedBitSet.<init>(FixedBitSet.java:187)
at 
org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:104)
at 
org.elasticsearch.index.cache.filter.weighted.WeightedFilterCache$FilterCacheFilterWrapper.getDocIdSet(WeightedFilterCache.java:177)
at 
org.elasticsearch.common.lucene.search.OrFilter.getDocIdSet(OrFilter.java:55)
at 
org.elasticsearch.common.lucene.search.ApplyAcceptedDocsFilter.getDocIdSet(ApplyAcceptedDocsFilter.java:46)
at 
org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:130)
at 
org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:542)
at 
org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:136)
at 
org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)
at 
org.apache.lucene.index.BufferedUpdatesStream.applyQueryDeletes(BufferedUpdatesStream.java:554)
at 
org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:287)
at 
org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3271)
at 
org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3262)
at 
org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:421)
at 
org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:292)
at 
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:267)
at 
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:257)
at 
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:171)
at 
org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:118)
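The OutOfMemoryError above is thrown while allocating a FixedBitSet during refresh. As a rough sketch of why parent/child filters are heap-hungry (the segment size below is an illustrative assumption, not taken from this index), a FixedBitSet costs about one bit per document:

```python
def fixed_bitset_mb(max_doc):
    # A Lucene FixedBitSet needs roughly one bit per document in the segment.
    return max_doc / 8 / 2**20

# An assumed 500M-doc shard needs ~60 MB of heap for a single cached
# bitset filter, and parent/child queries can cache one per filter per
# segment, which adds up quickly on a 38GB index.
print(round(fixed_bitset_mb(500_000_000)))  # 60
```

This is also why `index.load_fixed_bitset_filters_eagerly: false` is among the settings suggested above: it avoids loading those bitsets up front at startup.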
   

Re: RepositoryMissingException when restoring into a new cluster

2015-01-06 Thread Thomas Ardal
That was exactly what I was missing. I hadn't created the repository named 
"elasticsearch_logs" on cluster B. After I created it, the restore runs 
smoothly.

Thanks, David!
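For anyone hitting the same 404: the repository must be registered under the same name on the restore-target cluster first. A minimal sketch of the registration body (the container and base_path values are made-up examples; the settings keys follow the Azure repository plugin):

```python
import json

def azure_repo_body(container="es-snapshots", base_path="logs"):
    # Body for: PUT /_snapshot/elasticsearch_logs  (run against cluster B)
    return json.dumps({
        "type": "azure",
        "settings": {"container": container, "base_path": base_path},
    })

print(azure_repo_body())
```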

On Wednesday, January 7, 2015 8:31:08 AM UTC+1, David Pilato wrote:
>
> Did you create the repository on cluster B?
> How?
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On 7 Jan 2015, at 08:19, Thomas Ardal wrote:
>
> I'm using the snapshot/restore feature of Elasticsearch, together with the 
> Azure plugin to backup snapshots to Azure blob storage. Everything works 
> when doing snapshots from a cluster and restoring to the same cluster. Now 
> I'm in a situation where I want to restore into an entirely new cluster (let's 
> call that cluster b) from a snapshot generated from cluster a. When I run a 
> restore request on cluster b, I get a 404. Doing a _status call on the 
> snapshot, I get the same error:
>
> {"error":"RepositoryMissingException[[elasticsearch_logs] 
> missing]","status":404}
>
>
> The new cluster is configured with the Azure plugin and the same settings for 
> Azure. I guess the error is caused by the fact that Elasticsearch generates 
> some metadata about the snapshots and stores it locally in the _snapshot 
> index, and this index is not on the new cluster. The same error happens 
> if I delete the data dir on cluster a and try to restore cluster a from a 
> snapshot.
>
>
> How would I deal with a situation like this?
>



RepositoryMissingException when restoring into a new cluster

2015-01-06 Thread Thomas Ardal
I'm using the snapshot/restore feature of Elasticsearch, together with the 
Azure plugin to backup snapshots to Azure blob storage. Everything works 
when doing snapshots from a cluster and restoring to the same cluster. Now 
I'm in a situation where I want to restore into an entirely new cluster (let's 
call that cluster b) from a snapshot generated from cluster a. When I run a 
restore request on cluster b, I get a 404. Doing a _status call on the 
snapshot, I get the same error:

{"error":"RepositoryMissingException[[elasticsearch_logs] 
missing]","status":404}


The new cluster is configured with the Azure plugin and the same settings for 
Azure. I guess the error is caused by the fact that Elasticsearch generates 
some metadata about the snapshots and stores it locally in the _snapshot 
index, and this index is not on the new cluster. The same error happens if 
I delete the data dir on cluster a and try to restore cluster a from a snapshot.


How would I deal with a situation like this?



Re: Elasticsearch Frontend webapp and boilerplate query code

2015-01-02 Thread Thomas
Yes,

Let's say you want to render a pie chart with some sort of aggregated data 
extracted from elasticsearch. Instead of writing the query in javascript, or 
keeping it client side in the code, we want something like a simple GET API 
call, for instance getMyPieData(), on another service that returns the data 
needed to render that information.

If we let elasticsearch do that, we may want to use search templates:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-template.html#pre-registered-templates

If we also want to cache that query, we may use the elasticsearch shard query cache:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-shard-query-cache.html#index-modules-shard-query-cache

That is more or less what elasticsearch provides for my use case, afaik. My 
question is whether there is something else I can do to cache and "hide" my 
queries apart from the above solutions, which tie me to elasticsearch (some 
separate module, technology, etc., based on best practices). In another case I 
might want to replace elasticsearch with something else and still not affect 
my frontend code, or, more importantly, not constantly hit elasticsearch for 
the same data.
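A small TTL cache in front of the query layer is the usual way to avoid hitting elasticsearch repeatedly for the same data. A minimal sketch, where `fetch` stands in for whatever executes the registered template (all names here are illustrative):

```python
import time

def make_cached(fetch, ttl=60.0, clock=time.monotonic):
    """Wrap fetch(key) with a simple time-based (TTL) cache."""
    cache = {}
    def cached(key):
        hit = cache.get(key)
        if hit is not None and clock() - hit[0] < ttl:
            return hit[1]          # fresh enough: skip the backend
        value = fetch(key)
        cache[key] = (clock(), value)
        return value
    return cached

calls = []
get_pie_data = make_cached(lambda key: calls.append(key) or {"slices": 3})
get_pie_data("my_pie")
get_pie_data("my_pie")   # served from cache; the backend ran only once
print(len(calls))        # 1
```

Putting this wrapper in its own service layer also keeps the frontend decoupled from elasticsearch, since only `fetch` would change if the backend changed.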

I'm trying to first verify that I will not be reinventing the wheel before 
building my own solution.

Thank you again 

On Friday, 2 January 2015 17:04:59 UTC+2, Thomas wrote:
>
> Hi,
>
> I wish everybody a happy new year, all the best for 2015, and in 
> continuation of the great success of ES,
>
> In our project we intend to create a simple webapp that will query 
> elasticsearch for insights. We do not want to directly query elasticsearch 
> for two reasons:
>
>- security
>- avoid boilerplate query code and to be able to decouple it
>
> What is the best way to achieve that? We are currently evaluating building 
> the frontend as a python/django project. Has anyone faced a similar task, and 
> is it possible to share some thoughts?
>
> In other situations nginx was a solution for security, but what is the most 
> well-established way to avoid having all the boilerplate query code client 
> side (e.g. in javascript)?
>
> Finally, there are cases where some caching may be needed to avoid hitting 
> elasticsearch constantly for the same data; how is this tackled? Do we need 
> to build our own module for all of this?
>
> thank you in advance
>
> Thomas
>



Elasticsearch Frontend webapp and boilerplate query code

2015-01-02 Thread Thomas
Hi,

I wish everybody a happy new year, all the best for 2015, and in 
continuation of the great success of ES,

In our project we intend to create a simple webapp that will query 
elasticsearch for insights. We do not want to directly query elasticsearch 
for two reasons:

   - security
   - avoid boilerplate query code and to be able to decouple it

What is the best way to achieve that? We are currently evaluating building 
the frontend as a python/django project. Has anyone faced a similar task, and 
is it possible to share some thoughts?

In other situations nginx was a solution for security, but what is the most 
well-established way to avoid having all the boilerplate query code client 
side (e.g. in javascript)?

Finally, there are cases where some caching may be needed to avoid hitting 
elasticsearch constantly for the same data; how is this tackled? Do we need 
to build our own module for all of this?

thank you in advance

Thomas



Re: using a nested object field in a multi_match query

2014-12-30 Thread thomas . vaughan


On Wednesday, December 10, 2014 4:33:12 PM UTC-3, thomas@beatport.com 
wrote:
>
>
>
> On Monday, August 11, 2014 1:29:56 PM UTC-4, Mike Topper wrote:
>>
>> Hello,
>>
>> I'm having trouble coming up with how to supply a field within a nested 
>> object in the multi_match fields list.  I'm using the multi_match query in 
>> order to perform query time field boosting, but something like:
>>
>>
>>   "query": {
>> "multi_match": {
>>   "query": "China Mieville",
>>   "operator": "and",
>>   "fields": [
>> "_all", "title^2", "author.name^1.5"
>>   ]
>> }
>>   }
>>
>> doesn't seem to work. The title is boosted fine, but if I take out the 
>> "_all" field I can see that author.name is never being used. Is there a way 
>> to supply nested fields within a multi_match query?
>>
>
> I've just been bitten by this too. Does anyone know how to make this work?
>

In our case we switched the mapping type from "nested" to "object" and then 
this worked. I'm aware of the implications of this switch. We don't need 
the features provided by "nested". Others may, of course.
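For reference, a sketch of what the changed mapping looks like (field names follow the example in the quoted question; this is illustrative, not the full mapping):

```python
import json

# "object" instead of "nested": author.name is then flattened into the
# parent document and can be listed directly in the multi_match "fields"
# list, at the cost of losing per-author matching semantics.
mapping = {
    "properties": {
        "title": {"type": "string"},
        "author": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
        },
    }
}
print(json.dumps(mapping))
```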

Thanks.

-Tom

 




Re: using a nested object field in a multi_match query

2014-12-10 Thread thomas . vaughan


On Monday, August 11, 2014 1:29:56 PM UTC-4, Mike Topper wrote:
>
> Hello,
>
> I'm having trouble coming up with how to supply a field within a nested 
> object in the multi_match fields list.  I'm using the multi_match query in 
> order to perform query time field boosting, but something like:
>
>
>   "query": {
> "multi_match": {
>   "query": "China Mieville",
>   "operator": "and",
>   "fields": [
> "_all", "title^2", "author.name^1.5"
>   ]
> }
>   }
>
> doesn't seem to work. The title is boosted fine, but if I take out the 
> "_all" field I can see that author.name is never being used. Is there a way 
> to supply nested fields within a multi_match query?
>

I've just been bitten by this too. Does anyone know how to make this work?

Thanks.

-Tom




Re: Performance issue while indexing lot of documents

2014-11-06 Thread Thomas Matthijs
On Thu, Nov 6, 2014 at 11:09 AM, Moshe Recanati  wrote:

> // bulkRequest = client.prepareBulk();



Please fix your code so it clearly sends only 1000 documents per bulk request.
It looks like you are just increasing the size of the same bulk request and
executing it over and over.
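In other words, start a fresh bulk request for each batch instead of reusing one that keeps growing. The batching logic, sketched in Python for clarity (the 1000-document cut-off mirrors the advice above):

```python
def batches(docs, size=1000):
    """Yield docs in fixed-size batches; each batch should map to one
    fresh bulk request (cf. calling client.prepareBulk() per batch)."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []   # start a new request; never keep growing the old one
    if batch:
        yield batch      # flush the final partial batch

sizes = [len(b) for b in batches(range(2500), size=1000)]
print(sizes)  # [1000, 1000, 500]
```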



Re: Modify the index setting after the index created ? what's the function of "search_quote_analyzer" ?

2014-10-27 Thread Thomas Christie
Bump, I'm having the same problem.

On Thursday, June 12, 2014 10:32:14 PM UTC-5, Ivan Ji wrote:
>
> Hi all,
>
> I want to modify one field's search analyzer from "standard" to "keyword" 
> after the index is created. So I tried to PUT the mapping:
>
> $ curl -XPUT 'http://localhost:9200/qindex/main/_mapping' -d '
>> {
>> "main" : {
>> "properties" : {
>> "name" : { "type": "string", "index": "analyzed", 
>> "index_analyzer": "filename_ngram", "search_analyzer": "keyword"}
>> }
>> }
>> }
>> '
>
>
> The operation seems to have succeeded, though I expected it might conflict; 
> in what situations might a conflict occur? This is my first question.
>
> Anyway then I try to get the mapping out: (partial)
>
>   "name": {
>> "type": "string",
>> "index_analyzer": "filename_ngram",
>> "search_analyzer": "keyword",
>> "include_in_all": true,
>> "search_quote_analyzer": "standard"
>> }
>
>
> So I am wondering: did my operation succeed? What is the function of 
> "search_quote_analyzer"? And it still remains "standard"; does that 
> matter?
>
> Could anyone answer me these questions?
>
> Cheers,
>
> Ivan
>



Re: Elasticsearch script execution on missing field

2014-09-17 Thread Thomas
I think the correct way to check for a missing field is the following:

doc['countryid'].empty == true

Check also:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_document_fields
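Putting that together, here is the terms facet from the question with the missing-field check rewritten (a sketch; the ternary form is assumed to be accepted by the script engine in that version):

```python
import json

facet = {
    "loca": {
        "terms": {
            "field": "countryid",
            # .empty, not == null, detects a document without the field
            "script": "doc['countryid'].empty ? 1 : doc['countryid'].value",
        }
    }
}
print(json.dumps({"facets": facet}))
```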

btw why such an old version of ES?

Thomas

On Wednesday, 17 September 2014 13:53:08 UTC+3, Manoj wrote:
>
> I am currently using ES version 0.19. For a feature requirement, I wanted 
> to execute script on missing field through terms facet. The curl which I 
> tried is something like below
>
> 
> {
> "query": {
> "term": {
> "content": "deep"
> }
> },
> "filter": {
> "and": {
> "filters": [
> {
> "type": {
> "value": "twitter"
> }
> }
> ]
> }
> },
> "facets": {
> "loca": {
> "terms": {
> "field": "countryid",
> "script": "doc['countryid']==null?1:doc['countryid'].value"
> }
> }
> }
> }
> 
>
> I assume that missing fields can be accessed by the condition 
> doc['countryid']==null. But it looks like this is not the way to identify 
> the missing field in script :-(
>
> Because of this, I am always receiving the response counted as "missing":
>
> {
>   "took" : 1,
>   "timed_out" : false,
>   "_shards" : {
> "total" : 6,
> "successful" : 6,
> "failed" : 0
>   },
>   "hits" : {
> "total" : 0,
> "max_score" : null,
> "hits" : [ ]
>   },
>   "facets" : {
> "loca" : {
>   "_type" : "terms",
>   "missing" : 1,
>   "total" : 0,
>   "other" : 0,
>   "terms" : [ ]
> }
>   }
> }
>
> Could anybody help me get this right?
>
> Thanks in advance, Manoj
>



Re: Using ES as a primary datastore.

2014-09-17 Thread Thomas
Hi,

First you have to calculate the volume you will keep in one shard, then break 
your total volume into the number of shards you will maintain, and then scale 
accordingly to a number of nodes; at the very least, as your volumes grow you 
should grow your cluster as well.

It is difficult to predict what problems may arise; your case is too generic. 
What will the usage of the cluster be? What queries will you perform? Will 
you mostly do indexing and only occasionally query, or will you query your 
data intensively?

Most important, you need to think about how you will partition your data: 
will you have one index, or multiple indices as in a logstash approach?
Maybe check here: https://www.found.no/foundation/sizing-elasticsearch/

What will you do with data older than a year, delete it? Can you afford to 
lose data? Will you keep backups?

IMHO, these are some of the questions you must answer in order to see 
whether such an approach suits your needs. It comes down to hardware, and to 
the structure and partitioning of your data.
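The volume stated in the original question checks out, as a quick sanity check shows (decimal units assumed):

```python
tx_per_day = 1_000_000
tx_bytes = 500 * 1000            # ~500KB per transaction
daily_tb = tx_per_day * tx_bytes / 1000**4
yearly_tb = daily_tb * 365
print(round(yearly_tb, 1))       # 182.5, i.e. roughly the 180TB quoted
```

Note this is raw document size only; replicas, indexing overhead, and any headroom for merges would multiply it further.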

Thomas

On Wednesday, 17 September 2014 13:41:55 UTC+3, P Suman wrote:
>
> Hello,
>
>  We are planning to use ES as a primary datastore. 
>
> Here is my usecase
>
> We receive a million transactions per day (all are inserts). 
> Each transaction is around 500KB in size and has 10 fields; we should 
> be able to search on all 10 fields. 
> We want to keep around 1 yr worth of data, this comes around 180TB
>
> Can you please let me know any problems that might arise if i use elastic 
> search as the primary datastore.
>
>
>
> Regards,
> Suman
>
>
>
>
>



Floating point precision in response

2014-09-17 Thread Thomas
Hi,

I have a quick question regarding numeric values in responses. I perform a 
sum aggregation, and when I get the response back from a curl request the 
number is shown as follows:


"aggs":{
>   "day_clicks":{
> "sum": {
>"field" : "clicks"
> }
>   }
> }


response

...
> "doc_count": 384,
> "day_clicks": {
> "value": 2.7372883E7
> },
> 


Notice the E7, i.e. scientific floating point notation, instead of just 
printing the actual number:

...
> "value": 27372883...


Has anyone faced a similar case? At what level is this happening: in 
elasticsearch's response or later? I have noticed in marvel/sense that the 
response comes back this way and the transformation happens client side. Is 
there a way to change that in the response of ES?
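The two renderings are the same double; 2.7372883E7 is simply scientific notation chosen by the serializer. A quick check, and one way a client can format the value without the exponent:

```python
value = float("2.7372883E7")   # as it appears in the JSON response
assert value == 27372883.0     # same number, different rendering
print(f"{value:.0f}")          # 27372883, formatted client side
```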

Thank you very much

Thomas



Re: Indexing is becoming slow, what to look for?

2014-09-09 Thread Thomas
Setting this parameter raised some additional questions for me:

If I set indices.memory.index_buffer_size on a specific node and not on all 
nodes of the cluster, will this configuration be taken into account by all 
nodes? Is it cluster wide, or does it apply only to index operations on that 
specific node? So do I need to set it on all nodes one by one, do a restart, 
and then see the effects?

Finally, if we index data into an index of 10 shards and I have 5 nodes, each 
node will index into 2 shards; will indices.memory.index_buffer_size then 
refer to those specific two shards?

Thank you very much

Thomas

On Friday, 5 September 2014 11:44:42 UTC+3, Thomas wrote:
>
> Hi,
>
> I have been performing indexing operations in my elasticsearch cluster for 
> some time now. Suddenly, I have been facing some latency while indexing and 
> I'm trying to find the reason for it. 
>
> Details:
>
> I have a custom process which is uploading every interval a number of logs 
> with bulk API. This process was taking about 5-7 minutes every time. For 
> some reason, the last days I noticed that the exact same procedure, same 
> volumes, takes about 15-20 minutes. While manipulating the data I run 
> update operations through scripting (groovy). My cluster is a set of 5 
> nodes; my first impression was that I needed to scale, so I added an extra 
> node. The problem seemed to be solved, but after a day I faced the same 
> issue again. 
>
> Could you give some ideas about what to check, or what the issue might be? 
> How is it possible to check whether a background process is running or 
> creating any issues (expunge etc.)? Does anyone have similar problems?
>
> Any help appreciated, let me know what info to share
>
> ES version is 1.3.1
> JDK is 1.7
>
> Thanks
>



Re: Indexing is becoming slow, what to look for?

2014-09-05 Thread Thomas
Got it thanks

On Friday, 5 September 2014 11:44:42 UTC+3, Thomas wrote:
>
> Hi,
>
> I have been performing indexing operations in my elasticsearch cluster for 
> some time now. Suddenly, I have been facing some latency while indexing and 
> I'm trying to find the reason for it. 
>
> Details:
>
> I have a custom process which is uploading every interval a number of logs 
> with bulk API. This process was taking about 5-7 minutes every time. For 
> some reason, the last days I noticed that the exact same procedure, same 
> volumes, takes about 15-20 minutes. While manipulating the data I run 
> update operations through scripting (groovy). My cluster is a set of 5 
> nodes; my first impression was that I needed to scale, so I added an extra 
> node. The problem seemed to be solved, but after a day I faced the same 
> issue again. 
>
> Could you give some ideas about what to check, or what the issue might be? 
> How is it possible to check whether a background process is running or 
> creating any issues (expunge etc.)? Does anyone have similar problems?
>
> Any help appreciated, let me know what info to share
>
> ES version is 1.3.1
> JDK is 1.7
>
> Thanks
>



Re: Indexing is becoming slow, what to look for?

2014-09-05 Thread Thomas
Hi,

I wanted to clarify something from the blog post you mentioned. You specify 
that, based on calculations, we should "give at most ~512 MB indexing 
buffer per active shard...". What I wanted to ask is: what do we mean by 
the term "active"? Do you mean only the primary shards, or not?
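The arithmetic I am working from, sketched out (the heap size and shard count are example assumptions; the 512 MB cap is the figure quoted from the blog post):

```python
def buffer_per_shard_mb(heap_mb, active_shards, index_buffer_pct=0.10):
    # indices.memory.index_buffer_size defaults to 10% of the node's heap,
    # split across the shards actively indexing on that node; the blog post
    # suggests ~512 MB per shard as a useful upper bound.
    return min(heap_mb * index_buffer_pct / active_shards, 512)

# e.g. an assumed 8GB heap with 2 shards actively indexing on the node:
print(buffer_per_shard_mb(8192, 2))  # 409.6
```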

Thank you again

On Friday, 5 September 2014 11:44:42 UTC+3, Thomas wrote:
>
> Hi,
>
> I have been performing indexing operations in my elasticsearch cluster for 
> some time now. Suddenly, I have been facing some latency while indexing and 
> I'm trying to find the reason for it. 
>
> Details:
>
> I have a custom process which is uploading every interval a number of logs 
> with bulk API. This process was taking about 5-7 minutes every time. For 
> some reason, the last days I noticed that the exact same procedure, same 
> volumes, takes about 15-20 minutes. While manipulating the data I run 
> update operations through scripting (groovy). My cluster is a set of 5 
> nodes; my first impression was that I needed to scale, so I added an extra 
> node. The problem seemed to be solved, but after a day I faced the same 
> issue again. 
>
> Is it possible to give some ideas about what to check, or what seems to be 
> the issue? How is possible to check if a background process is running or 
> creating any issues (expunge etc.)? Does anyone has any similar problems?
>
> Any help appreciated, let me know what info to share
>
> ES version is 1.3.1
> JDK is 1.7
>
> Thanks
>



Re: Indexing is becoming slow, what to look for?

2014-09-05 Thread Thomas
Thanks Michael,

I will read the post in detail and let you know of any findings.

Thomas.

On Friday, 5 September 2014 11:44:42 UTC+3, Thomas wrote:
>
> Hi,
>
> I have been performing indexing operations in my elasticsearch cluster for 
> some time now. Suddenly, I have been facing some latency while indexing and 
> I'm trying to find the reason for it. 
>
> Details:
>
> I have a custom process which is uploading every interval a number of logs 
> with bulk API. This process was taking about 5-7 minutes every time. For 
> some reason, the last days I noticed that the exact same procedure, same 
> volumes, takes about 15-20 minutes. While manipulating the data I run 
> update operations through scripting (groovy). My cluster is a set of 5 
> nodes, my first impression was that I need to scale therefore I added an 
> extra node. The problem seemed that it was solved but after a day again I 
> face the same issue. 
>
> Is it possible to give some ideas about what to check, or what seems to be 
> the issue? How is possible to check if a background process is running or 
> creating any issues (expunge etc.)? Does anyone has any similar problems?
>
> Any help appreciated, let me know what info to share
>
> ES version is 1.3.1
> JDK is 1.7
>
> Thanks
>



Re: aggregations

2014-09-05 Thread Thomas
Which version of ES have you been using? As far as I know, in later versions 
you can control the percentage of heap space that fielddata is allowed to use 
via the update settings API. Try increasing it a bit and see what happens; the 
default is 60%, so increase it to 70%, for example:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.html#fielddata-circuit-breaker
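
As a sketch (for ES 1.x before 1.4, where the setting was named 
indices.fielddata.breaker.limit; it was renamed to 
indices.breaker.fielddata.limit in 1.4), raising the limit via the cluster 
update settings API would look roughly like this:

PUT /_cluster/settings
{
  "persistent": {
    "indices.fielddata.breaker.limit": "70%"
  }
}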

T.

On Wednesday, 3 September 2014 19:58:02 UTC+3, navdeep agarwal wrote:
>
> hi ,
>
> i am bit new Elastic search ,while testing on elasticsearch's aggregation 
> feature ,i am always hitting data too large,i understand that aggregations 
> are very memory intensive , so is there any way query in ES where one 
> query's output can be  ingested to aggregation so that number of input to 
> aggregation is limited . i have used filter and querying before 
> aggregations .
>
> i have around 60 GB index on 5 shards .
>
> queries i tried:
>
> GET **/_search
> {
>   "query": {"term": {
> "file_sha2": {
>   "value": ""
> }
>   }}, 
>   
>   "aggs": {
>   "top_filename": {
> "max": {
>   "field": "portalid"
> }
>   }
>   
>   }
> }
>
> ---
>
> GET /_search
> {
>   
> "aggs": {
>   "top filename": {
> "filter": {"term": {
>   "file_sha2": "xx"
> }},
> "aggs": {
>   "top_filename": {
> "max": {
>   "field": "portalid"
> }
>   }
> }
>   }
> }
> 
> 
>   
> }
>
>
> thanks in advance .
>  



Indexing is becoming slow, what to look for?

2014-09-05 Thread Thomas
Hi,

I have been performing indexing operations in my elasticsearch cluster for 
some time now. Suddenly, I have been facing some latency while indexing and 
I'm trying to find the reason for it. 

Details:

I have a custom process which uploads a batch of logs at every interval with 
the bulk API. This process used to take about 5-7 minutes each time. For 
some reason, in the last few days I have noticed that the exact same procedure, 
with the same volumes, takes about 15-20 minutes. While manipulating the data I 
run update operations through scripting (Groovy). My cluster is a set of 5 
nodes; my first impression was that I needed to scale, so I added an 
extra node. The problem seemed to be solved, but after a day I faced the 
same issue again. 

Is it possible to give some ideas about what to check, or what the issue might 
be? How is it possible to check whether a background process is running or 
causing any issues (expunge etc.)? Does anyone have any similar problems?
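
For reference, these are the APIs I know of for checking background activity 
such as merges or pending cluster tasks (the index name below is hypothetical); 
is this the right place to look?

GET /_nodes/hot_threads
GET /_cat/pending_tasks?v
GET /my_index/_segments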

Any help appreciated, let me know what info to share

ES version is 1.3.1
JDK is 1.7

Thanks



Re: Integration testing a native script

2014-07-30 Thread Thomas
I noticed that you mention a native Java script, so have you implemented it as 
a plugin? If so, try the following in your settings:

    final Settings settings = settingsBuilder()
        ...
        .put("plugin.types", YourPlugin.class.getName())

Thomas


On Wednesday, 30 July 2014 12:31:06 UTC+3, Nick T wrote:
>
> Is there a way to have a native java script accessible in integration 
> tests? In my integration tests I am creating a test node in the /tmp 
> folder. 
>
> I've tried copying the script to /tmp/plugins/scripts but that was quite 
> hopeful and unfortunately does not work.
>
> Desperate for help.
>
> Thanks
>



Re: Integration testing a native script

2014-07-30 Thread Thomas
Hi,

I have tried the same approach and it worked for me: I copy the script I want 
to integration-test into place and then run my IT.

I do the following steps

1) Setup the required paths for elasticsearch

 final Settings settings
= settingsBuilder()
.put("http.enabled", "true")
.put("path.conf", confDir)
.put("path.data", dataDir)
.put("path.work", workDir)
.put("path.logs", logsDir)


2) copy your scripts to the appropriate location
3) fire up a local node
 node = 
nodeBuilder().local(true).settings(settings).clusterName(nodeName).node();
 node.start();

Maybe you first start the node and then add the script; this might not work 
because I think ES scans for new scripts once per minute, and the IT does not 
wait for that to happen. Hence you should first copy your script and then 
start the node.

Hope it helps

Thomas

On Wednesday, 30 July 2014 12:31:06 UTC+3, Nick T wrote:
>
> Is there a way to have a native java script accessible in integration 
> tests? In my integration tests I am creating a test node in the /tmp 
> folder. 
>
> I've tried copying the script to /tmp/plugins/scripts but that was quite 
> hopeful and unfortunately does not work.
>
> Desperate for help.
>
> Thanks
>



Cloud-aws version for 1.3.1 of elasticsearch

2014-07-30 Thread Thomas
Hi,

I wanted to ask whether the version of the cloud-aws plugin is 2.1.1 for 
elasticsearch 1.3.1, by looking at the GitHub page:
https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/es-1.3

How come the plugin version for elasticsearch 1.3.1 goes backwards? For 
elasticsearch 1.2.x the version of cloud-aws is 2.2.0.
Is this correct?

Thank you very much
Thomas



Re: 1.1.1 to 1.3 upgrade possible?

2014-07-29 Thread Thomas
Thanks Mark,

I can see that, as you mentioned, the new version 1.3.1 has been released.

Thomas

On Monday, 28 July 2014 11:11:57 UTC+3, Thomas wrote:
>
> Hi,
>
> I maintain a working cluster which is in version 1.1.1 and I'm planning to 
> upgrade to version 1.3.0 which is released the previous week. I wanted to 
> ask whether it is compatible to upgrade or whether I will have any known 
> issues/problems, what to expect in general.
>
> Thank you very much
> Thomas
>



Re: 1.1.1 to 1.3 upgrade possible?

2014-07-28 Thread Thomas
Great,

thanks for your reply Mark

On Monday, 28 July 2014 11:11:57 UTC+3, Thomas wrote:
>
> Hi,
>
> I maintain a working cluster which is in version 1.1.1 and I'm planning to 
> upgrade to version 1.3.0 which is released the previous week. I wanted to 
> ask whether it is compatible to upgrade or whether I will have any known 
> issues/problems, what to expect in general.
>
> Thank you very much
> Thomas
>



1.1.1 to 1.3 upgrade possible?

2014-07-28 Thread Thomas
Hi,

I maintain a working cluster which is on version 1.1.1, and I'm planning to 
upgrade to version 1.3.0, which was released last week. I wanted to ask 
whether the upgrade is compatible, whether I should expect any known 
issues/problems, and what to expect in general.
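
For context, the rolling-upgrade procedure I have in mind is roughly the 
following (the allocation setting has existed since 1.0, but please correct me 
if the steps differ for these versions): disable shard allocation, upgrade and 
restart one node at a time, then re-enable allocation:

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}

(upgrade and restart the node, wait for it to rejoin)

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}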

Thank you very much
Thomas



Re: Aggregation on parent/child documents

2014-07-25 Thread Thomas
Hi Adrien and thank you for the reply,

This is exactly what I had in mind, alongside the reverse-search equivalent 
(reverse_nested). As far as I can see this is planned for version 1.4.0 
onwards; I will keep track of any updates on this. Thanks

Thomas

On Friday, 25 July 2014 14:54:50 UTC+3, Thomas wrote:
>
> Hi,
>
> I wanted to ask whether is possible to perform aggregations combining 
> parent/child documents, something similar with the nested aggregation and 
> the reverse nested aggregation. It would be very helpful to have the 
> ability to create for instance buckets based on parent document fields and 
> get back aggregations that contain fields of both parent and children 
> documents combined.
>
> Any thoughts, future features to be added in the near releases, related to 
> the above?
>
> Thank you
> Thomas
>



Aggregation on parent/child documents

2014-07-25 Thread Thomas
Hi,

I wanted to ask whether it is possible to perform aggregations combining 
parent/child documents, similar to the nested aggregation and the reverse 
nested aggregation. It would be very helpful to have the ability, for 
instance, to create buckets based on parent document fields and get back 
aggregations that contain fields of both parent and child documents combined.

Any thoughts, or future features planned for the upcoming releases, related to 
the above?
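
For comparison, this is what I mean by the existing nested/reverse_nested pair 
(index and field names here are hypothetical): bucket on a nested field, then 
step back to the parent document's fields. An analogous pair for parent/child 
relations is what I am asking about.

GET /my_index/_search
{
  "aggs": {
    "comments": {
      "nested": { "path": "comments" },
      "aggs": {
        "by_author": {
          "terms": { "field": "comments.author" },
          "aggs": {
            "back_to_parent": {
              "reverse_nested": {},
              "aggs": {
                "post_titles": { "terms": { "field": "title" } }
              }
            }
          }
        }
      }
    }
  }
}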

Thank you
Thomas



Re: elasticsearch init script for centos or rhel ?

2014-07-16 Thread Thomas Kuther
The one from the elasticsearch CentOS rpm repository works fine here on EL6.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-repositories.html
(there are also 1.0 and 1.1 repos, simply adjust the baseurl)

The source is here:
https://github.com/elasticsearch/elasticsearch/blob/master/src/rpm/init.d/elasticsearch
...but I recommend the rpm from the repo because of /etc/sysconfig, the
install locations, etc.; it is much easier that way.
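
For reference, the repo definition from the setup page looked roughly like 
this at the time (verify the baseurl and GPG key against the linked docs 
before use):

[elasticsearch-1.3]
name=Elasticsearch repository for 1.3.x packages
baseurl=http://packages.elasticsearch.org/elasticsearch/1.3/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1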

~Tom

Am 16.07.2014 09:12, schrieb Aesop Wolf:
> Did you ever find a script that works on CentOS? I'm also looking for one.
>
> On Friday, March 14, 2014 9:18:04 AM UTC-7, Dominic Nicholas wrote:
>
> Thanks. 
> Does anyone know of a version that
> uses  /etc/rc.d/init.d/functions instead of /lib/lsb, that would
> work on CentOS and work with elasticsearch 1.0.1 ?
> Dom
>
> On Friday, March 14, 2014 9:24:12 AM UTC-4, David Pilato wrote:
>
> May be
> this? 
> https://github.com/elasticsearch/elasticsearch/blob/master/src/deb/init.d/elasticsearch
> 
> 
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> On 14 March 2014 at 14:19, Dominic Nicholas wrote:
>
> Hi - can someone please point me to an /etc/init.d script for
> elasticsearch 1.0.1 for CentOS or RHEL ?
>
> Thanks



Re: Setting id of document with elasticsearch-hadoop that is not in source document

2014-07-11 Thread Brian Thomas
I was just curious whether there was a way of doing this without adding the 
field; I can add it if necessary.

As an alternative: what if, in addition to es.mapping.id, there were another 
property available, say es.mapping.id.exclude, that would not include 
the id field in the source document? In elasticsearch, you can create and 
update documents without having to include the id in the source document, 
so I think it would make sense to be able to do that with 
elasticsearch-hadoop also.
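
For context, the existing mechanism is configured like this (a sketch; the 
field name docId is hypothetical):

es.resource = media/docs
es.write.operation = update
es.mapping.id = docId

The property proposed above would simply let that id field be stripped from 
the source before it is written.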

On Thursday, July 10, 2014 5:49:18 PM UTC-4, Costin Leau wrote:
>
> You need to specify the id of the document you want to update somehow. 
> Since in es-hadoop things are batch focused, each 
> doc needs its own id specified somehow hence the use of 'es.mapping.id' 
> to indicate its value. 
> Is there a reason why this approach does not work for you - any 
> alternatives that you thought of? 
>
> Cheers, 
>
> On 7/7/14 10:48 PM, Brian Thomas wrote: 
> > I am trying to update an elasticsearch index using elasticsearch-hadoop. 
>  I am aware of the *es.mapping.id* 
> > configuration where you can specify that field in the document to use as 
> an id, but in my case the source document does 
> > not have the id (I used elasticsearch's autogenerated id when indexing 
> the document).  Is it possible to specify the id 
> > to update without having the add a new field to the MapWritable object? 
> -- 
> Costin 
>



Re: Setting id of document with elasticsearch-hadoop that is not in source document

2014-07-11 Thread Brian Thomas
I was just curious whether there was a way of doing this without adding the 
field; I can add it if necessary.

As an alternative: what if, in addition to es.mapping.id, there were another 
property available, say es.mapping.id.include.in.src, where you could 
specify whether the id field actually gets included in the source 
document? In elasticsearch, you can create and update documents without 
having to include the id in the source document, so I think it would make 
sense to be able to do that with elasticsearch-hadoop also.

On Thursday, July 10, 2014 5:49:18 PM UTC-4, Costin Leau wrote:
>
> You need to specify the id of the document you want to update somehow. 
> Since in es-hadoop things are batch focused, each 
> doc needs its own id specified somehow hence the use of 'es.mapping.id' 
> to indicate its value. 
> Is there a reason why this approach does not work for you - any 
> alternatives that you thought of? 
>
> Cheers, 
>
> On 7/7/14 10:48 PM, Brian Thomas wrote: 
> > I am trying to update an elasticsearch index using elasticsearch-hadoop. 
>  I am aware of the *es.mapping.id* 
> > configuration where you can specify that field in the document to use as 
> an id, but in my case the source document does 
> > not have the id (I used elasticsearch's autogenerated id when indexing 
> the document).  Is it possible to specify the id 
> > to update without having the add a new field to the MapWritable object? 
> -- 
> Costin 
>



Setting id of document with elasticsearch-hadoop that is not in source document

2014-07-07 Thread Brian Thomas
I am trying to update an elasticsearch index using elasticsearch-hadoop.  I 
am aware of the *es.mapping.id* configuration, where you can specify the 
field in the document to use as an id, but in my case the source document 
does not have the id (I used elasticsearch's autogenerated id when indexing 
the document).  Is it possible to specify the id to update without having 
to add a new field to the MapWritable object?




Re: java.lang.NoSuchFieldError: ALLOW_UNQUOTED_FIELD_NAMES when trying to query elasticsearch using spark

2014-07-07 Thread Brian Thomas
Here is the gradle build I was using originally:

apply plugin: 'java'
apply plugin: 'eclipse'

sourceCompatibility = 1.7
version = '0.0.1'
group = 'com.spark.testing'

repositories {
mavenCentral()
}

dependencies {
compile 'org.apache.spark:spark-core_2.10:1.0.0'
compile 'edu.stanford.nlp:stanford-corenlp:3.3.1'
compile group: 'edu.stanford.nlp', name: 'stanford-corenlp', version: 
'3.3.1', classifier:'models'
compile files('lib/elasticsearch-hadoop-2.0.0.jar')
testCompile 'junit:junit:4.+'
testCompile group: "com.github.tlrx", name: "elasticsearch-test", version: 
"1.2.1"
}


When I ran dependencyInsight on jackson, I got the following output:

C:\dev\workspace\SparkProject>gradle dependencyInsight --dependency 
jackson-core

:dependencyInsight
com.fasterxml.jackson.core:jackson-core:2.3.0
\--- com.fasterxml.jackson.core:jackson-databind:2.3.0
 +--- org.json4s:json4s-jackson_2.10:3.2.6
 |\--- org.apache.spark:spark-core_2.10:1.0.0
 | \--- compile
 \--- com.codahale.metrics:metrics-json:3.0.0
  \--- org.apache.spark:spark-core_2.10:1.0.0 (*)

org.codehaus.jackson:jackson-core-asl:1.0.1
\--- org.codehaus.jackson:jackson-mapper-asl:1.0.1
 \--- org.apache.hadoop:hadoop-core:1.0.4
  \--- org.apache.hadoop:hadoop-client:1.0.4
   \--- org.apache.spark:spark-core_2.10:1.0.0
\--- compile

Version 1.0.1 of jackson-core-asl does not have the field 
ALLOW_UNQUOTED_FIELD_NAMES, but later versions of it do.
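
A sketch of an alternative to adding the dependency explicitly: Gradle's 
resolution strategy can force the newer artifact (the version below is the 
one that resolved the issue in this thread):

configurations.all {
    resolutionStrategy {
        // make the newer jackson win over the 1.0.1 pulled in transitively by hadoop-core
        force 'org.codehaus.jackson:jackson-mapper-asl:1.9.13'
    }
}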

On Sunday, July 6, 2014 4:28:56 PM UTC-4, Costin Leau wrote:
>
> Hi,
>
> Glad to see you sorted out the problem. Out of curiosity what version of 
> jackson were you using and what was pulling it in? Can you share you maven 
> pom/gradle build?
>
>
> On Sun, Jul 6, 2014 at 10:27 PM, Brian Thomas  > wrote:
>
>> I figured it out, dependency issue in my classpath.  Maven was pulling 
>> down a very old version of the jackson jar.  I added the following line to 
>> my dependencies and the error went away:
>>
>> compile 'org.codehaus.jackson:jackson-mapper-asl:1.9.13'
>>
>>
>> On Friday, July 4, 2014 3:22:30 PM UTC-4, Brian Thomas wrote:
>>>
>>>  I am trying to test querying elasticsearch using Apache Spark using 
>>> elasticsearch-hadoop.  I am just trying to do a query to the elasticsearch 
>>> server and return the count of results.
>>>
>>> Below is my test class using the Java API:
>>>
>>> import org.apache.hadoop.conf.Configuration;
>>> import org.apache.hadoop.io.MapWritable;
>>> import org.apache.hadoop.io.Text;
>>> import org.apache.spark.SparkConf;
>>> import org.apache.spark.api.java.JavaPairRDD;
>>> import org.apache.spark.api.java.JavaSparkContext;
>>> import org.apache.spark.serializer.KryoSerializer;
>>> import org.elasticsearch.hadoop.mr.EsInputFormat;
>>>
>>> import scala.Tuple2;
>>>
>>> public class ElasticsearchSparkQuery{
>>>
>>> public static int query(String masterUrl, String 
>>> elasticsearchHostPort) {
>>> SparkConf sparkConfig = new SparkConf().setAppName("
>>> ESQuery").setMaster(masterUrl);
>>> sparkConfig.set("spark.serializer", 
>>> KryoSerializer.class.getName());
>>> JavaSparkContext sparkContext = new 
>>> JavaSparkContext(sparkConfig);
>>>
>>> Configuration conf = new Configuration();
>>> conf.setBoolean("mapred.map.tasks.speculative.execution", 
>>> false);
>>> conf.setBoolean("mapred.reduce.tasks.speculative.execution", 
>>> false);
>>> conf.set("es.nodes", elasticsearchHostPort);
>>> conf.set("es.resource", "media/docs");
>>> conf.set("es.query", "?q=*");
>>>
> >>> JavaPairRDD<Text, MapWritable> esRDD = 
> >>> sparkContext.newAPIHadoopRDD(conf, EsInputFormat.class, Text.class,
> >>> MapWritable.class);
>>> return (int) esRDD.count();
>>> }
>>> }
>>>
>>>
>>> When I try to run this I get the following error:
>>>
>>>
>>> 4/07/04 14:58:07 INFO executor.Executor: Running task ID 0
>>> 14/07/04 14:58:07 INFO storage.BlockManager: Found block broadcast_0 
>>> locally
>>> 14/07/04 14:58:07 INFO rdd.NewHadoopRDD: Input split: ShardInputSplit 
>>> [node=[5UATWUzmTUuNzhmGxXWy_w/S'byll|10.45.71.152:9

Re: java.lang.NoSuchFieldError: ALLOW_UNQUOTED_FIELD_NAMES when trying to query elasticsearch using spark

2014-07-06 Thread Brian Thomas
I figured it out: a dependency issue in my classpath.  Gradle was pulling down 
a very old version of the jackson jar.  I added the following line to my 
dependencies and the error went away:

compile 'org.codehaus.jackson:jackson-mapper-asl:1.9.13'

On Friday, July 4, 2014 3:22:30 PM UTC-4, Brian Thomas wrote:
>
>  I am trying to test querying elasticsearch using Apache Spark using 
> elasticsearch-hadoop.  I am just trying to do a query to the elasticsearch 
> server and return the count of results.
>
> Below is my test class using the Java API:
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.io.MapWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.spark.SparkConf;
> import org.apache.spark.api.java.JavaPairRDD;
> import org.apache.spark.api.java.JavaSparkContext;
> import org.apache.spark.serializer.KryoSerializer;
> import org.elasticsearch.hadoop.mr.EsInputFormat;
>
> import scala.Tuple2;
>
> public class ElasticsearchSparkQuery{
>
> public static int query(String masterUrl, String 
> elasticsearchHostPort) {
> SparkConf sparkConfig = new 
> SparkConf().setAppName("ESQuery").setMaster(masterUrl);
> sparkConfig.set("spark.serializer", 
> KryoSerializer.class.getName());
> JavaSparkContext sparkContext = new JavaSparkContext(sparkConfig);
>
> Configuration conf = new Configuration();
> conf.setBoolean("mapred.map.tasks.speculative.execution", false);
> conf.setBoolean("mapred.reduce.tasks.speculative.execution", 
> false);
> conf.set("es.nodes", elasticsearchHostPort);
> conf.set("es.resource", "media/docs");
> conf.set("es.query", "?q=*");
>
> JavaPairRDD<Text, MapWritable> esRDD = 
> sparkContext.newAPIHadoopRDD(conf, EsInputFormat.class, Text.class,
> MapWritable.class);
> return (int) esRDD.count();
> }
> }
>
>
> When I try to run this I get the following error:
>
>
> 4/07/04 14:58:07 INFO executor.Executor: Running task ID 0
> 14/07/04 14:58:07 INFO storage.BlockManager: Found block broadcast_0 
> locally
> 14/07/04 14:58:07 INFO rdd.NewHadoopRDD: Input split: ShardInputSplit 
> [node=[5UATWUzmTUuNzhmGxXWy_w/S'byll|10.45.71.152:9200],shard=0]
> 14/07/04 14:58:07 WARN mr.EsInputFormat: Cannot determine task id...
> 14/07/04 14:58:07 ERROR executor.Executor: Exception in task ID 0
> java.lang.NoSuchFieldError: ALLOW_UNQUOTED_FIELD_NAMES
> at 
> org.elasticsearch.hadoop.serialization.json.JacksonJsonParser.<init>(JacksonJsonParser.java:38)
> at 
> org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:75)
> at 
> org.elasticsearch.hadoop.rest.RestRepository.scroll(RestRepository.java:267)
> at 
> org.elasticsearch.hadoop.rest.ScrollQuery.hasNext(ScrollQuery.java:75)
> at 
> org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.next(EsInputFormat.java:319)
> at 
> org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.nextKeyValue(EsInputFormat.java:255)
> at 
> org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:122)
> at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1014)
> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:847)
> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:847)
> at 
> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1080)
> at 
> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1080)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> at org.apache.spark.scheduler.Task.run(Task.scala:51)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> Has anyone run into this issue with the JacksonJsonParser?
>
>



java.lang.NoSuchFieldError: ALLOW_UNQUOTED_FIELD_NAMES when trying to query elasticsearch using spark

2014-07-04 Thread Brian Thomas
 I am trying to test querying elasticsearch from Apache Spark using 
elasticsearch-hadoop.  I am just trying to run a query against the elasticsearch 
server and return the count of results.

Below is my test class using the Java API:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.serializer.KryoSerializer;
import org.elasticsearch.hadoop.mr.EsInputFormat;

import scala.Tuple2;

public class ElasticsearchSparkQuery {

    public static int query(String masterUrl, String elasticsearchHostPort) {
        SparkConf sparkConfig = new SparkConf().setAppName("ESQuery").setMaster(masterUrl);
        sparkConfig.set("spark.serializer", KryoSerializer.class.getName());
        JavaSparkContext sparkContext = new JavaSparkContext(sparkConfig);

        Configuration conf = new Configuration();
        conf.setBoolean("mapred.map.tasks.speculative.execution", false);
        conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
        conf.set("es.nodes", elasticsearchHostPort);
        conf.set("es.resource", "media/docs");
        conf.set("es.query", "?q=*");

        // EsInputFormat yields (Text docId, MapWritable source) pairs
        JavaPairRDD<Text, MapWritable> esRDD = sparkContext.newAPIHadoopRDD(
                conf, EsInputFormat.class, Text.class, MapWritable.class);
        return (int) esRDD.count();
    }
}


When I try to run this I get the following error:


14/07/04 14:58:07 INFO executor.Executor: Running task ID 0
14/07/04 14:58:07 INFO storage.BlockManager: Found block broadcast_0 locally
14/07/04 14:58:07 INFO rdd.NewHadoopRDD: Input split: ShardInputSplit 
[node=[5UATWUzmTUuNzhmGxXWy_w/S'byll|10.45.71.152:9200],shard=0]
14/07/04 14:58:07 WARN mr.EsInputFormat: Cannot determine task id...
14/07/04 14:58:07 ERROR executor.Executor: Exception in task ID 0
java.lang.NoSuchFieldError: ALLOW_UNQUOTED_FIELD_NAMES
at 
org.elasticsearch.hadoop.serialization.json.JacksonJsonParser.<init>(JacksonJsonParser.java:38)
at 
org.elasticsearch.hadoop.serialization.ScrollReader.read(ScrollReader.java:75)
at 
org.elasticsearch.hadoop.rest.RestRepository.scroll(RestRepository.java:267)
at 
org.elasticsearch.hadoop.rest.ScrollQuery.hasNext(ScrollQuery.java:75)
at 
org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.next(EsInputFormat.java:319)
at 
org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.nextKeyValue(EsInputFormat.java:255)
at 
org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:122)
at 
org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1014)
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:847)
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:847)
at 
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1080)
at 
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1080)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Has anyone run into this issue with the JacksonJsonParser?
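A `NoSuchFieldError` thrown at runtime (rather than a compile error) usually means a second, older copy of the class is on the classpath, shadowing the one the code was compiled against; here that would typically be an old `org.codehaus.jackson` jar bundled by Hadoop or Spark. As a hedged diagnostic sketch (the helper class and method names are hypothetical, not part of any library), one can check via reflection which jar a class was loaded from and whether it declares the expected field:

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper: report whether a class visible on the
// classpath declares the field the code expects, and where it was loaded
// from. A "MISSING" result usually means an older jar is shadowing the one
// the code was compiled against.
public class ClasspathCheck {
    public static String describe(String className, String fieldName) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            String where = (src == null) ? "bootstrap classpath" : src.getLocation().toString();
            try {
                c.getField(fieldName);
                return "OK: " + fieldName + " present in " + className + " (" + where + ")";
            } catch (NoSuchFieldException e) {
                return "MISSING: " + fieldName + " absent from " + className + " (" + where + ")";
            }
        } catch (ClassNotFoundException e) {
            return "NOT FOUND: " + className;
        }
    }

    public static void main(String[] args) {
        // In the Spark driver you would check
        // "org.codehaus.jackson.JsonParser$Feature" for "ALLOW_UNQUOTED_FIELD_NAMES";
        // a JDK enum stands in here so the snippet runs anywhere.
        System.out.println(describe("java.util.Locale$Category", "DISPLAY"));
    }
}
```

A `MISSING` result for `JsonParser$Feature` would point at a conflicting Jackson jar to exclude from the build.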

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9da5ae25-3e57-4c24-ab45-c62c987ebec0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Cross-index parent/child relationship

2014-06-26 Thread Thomas
Hi,

Unfortunately this is not supported by elasticsearch; the parent document 
and the child document must be in the same index, or else the routing 
will not be established. You can either try copying the parent documents if 
there are not many, or use another way to split your data, such as a hash 
function, to ensure that both the parent and the child documents are 
indexed into the same index.

Hope it helps
Thomas
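As a rough sketch of the hash-function idea above (the index naming scheme and helper are hypothetical, not an elasticsearch API): derive the concrete index name from the parent's id, so the parent and all of its children always land in the same index and parent/child routing keeps working:

```java
// Hypothetical helper: pick a concrete index from the parent id so that a
// parent and all of its children are always indexed together. Using the
// parent id (not the child id) for both document types is what keeps them
// co-located.
public class IndexPicker {
    public static String indexFor(String parentId, int numIndices) {
        // floorMod keeps the bucket non-negative even for negative hashCodes
        int bucket = Math.floorMod(parentId.hashCode(), numIndices);
        return "events-" + bucket;
    }
}
```

Both the parent and each of its children would be indexed into `indexFor(parentId, n)`, so queries with `has_parent`/`has_child` still resolve within one index.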

On Wednesday, 25 June 2014 04:48:48 UTC+3, Drew wrote:
>
> Hi! 
>
> Does ES support cross-index parent/child relationship? More specifically, 
> can I have all the parents in one index (say users) and the children (say 
> events) in a multiple time series style (managed by curator) indices? If 
> so, how is this done? If not, what’s the alternative? 
>
> Thanks, 
>
> Drew



Re: Proper parsing of String values like 1m, 1q HOUR etc.

2014-06-25 Thread Thomas
Hi,

Thanks again for your time. What I'm trying to do is to generate, for example, 
the time in milliseconds the same way the elasticsearch core does when you pass 
1q into the date histogram configuration. I'm trying to simulate the interval 
of the date histogram without using the date histogram, if possible. What is 
the one-liner of code (if you can call it that) that does this transformation 
of 1q into milliseconds, such that elasticsearch can produce the date histogram 
intervals? I compare time intervals with the date histogram and I want them to 
be exactly the same.

And please allow me to ask one more question: since elasticsearch uses Joda, is 
the start of the week always considered Monday, independently of the timezone?

Thanks!!

Thomas

On Tuesday, 17 June 2014 18:31:37 UTC+3, Thomas wrote:
>
> Hi,
>
> I was wondering whether there is a proper Utility class to parse the given 
> values and get the duration in milliseconds probably for values such as 1m 
> (which means 1 minute) 1q (which means 1 quarter) etc.
>
> I have found that elasticsearch utilizes class TimeValue but it only 
> parses up to week, and values such as WEEK, HOUR are not accepted. So is in 
> elasticsearch source any utility class that does the job ? (for Histograms, 
> ranges wherever is needed)
>
> Thank you
> Thomas
>
>



Re: Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread Thomas
My mistake sorry,

Here is an example:

I have the request document:

"request":{
 "dynamic" : "strict",
 "properties" : {
"time" : {
  "format" : "dateOptionalTime",
  "type" : "date"
},
"user_id" : {
   "index" : "not_analyzed",
   "type" : "string"
},
"country" : {
   "index" : "not_analyzed",
   "type" : "string"
}
  }
}

I want to find the number of (unique) user_ids that have X documents each, 
e.g. for country US, and ideally I need the full list, e.g.:


1000 users have 43 documents
..
100 users have 234 documents
150 users have 500 documents
etc..

In other words, the distribution of documents (requests) per unique user 
count. Of course I understand that this is a pretty heavy operation in terms 
of memory, but we could limit it to the top 100 rows, for instance, or find a 
workaround.
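One hedged workaround, assuming a terms aggregation on user_id can return the per-user document counts (with a large enough size, or paged per country filter): invert those buckets client-side into the "N users have M documents" shape. A minimal sketch of that post-processing step, with hypothetical class and method names:

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Client-side sketch: given per-user document counts (e.g. extracted from a
// terms aggregation on user_id), invert them into a distribution of
// "number of users" per "documents per user".
public class Distribution {
    public static SortedMap<Long, Integer> invert(Map<String, Long> docsPerUser) {
        SortedMap<Long, Integer> usersPerCount = new TreeMap<>();
        for (long docs : docsPerUser.values()) {
            // one more user observed with this document count
            usersPerCount.merge(docs, 1, Integer::sum);
        }
        return usersPerCount;
    }
}
```

The obvious caveat is that the terms aggregation must actually return all users (or enough of them), which is exactly the memory cost mentioned above.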

Thanks again for your time
Thomas

On Tuesday, 24 June 2014 13:32:13 UTC+3, Thomas wrote:
>
> Hi,
>
> I wanted to ask whether it is possible to get with the aggregation 
> framework the distribution of one specific type of documents sent per user, 
> I'm interested for occurrences of documents per user, e.g. :
>
> 1000 users sent 1 document 
> 500 users sent 2 documents
> X number of unique users sent Y documents (each)
> etc.
>
> on each document i index the user_id
>
> Is there a way to support such a query, or partially support it? get the 
> first 10 rows of this type of list not the exhaustive list. Can you give me 
> some hint? 
>
> Thanks
>



Re: Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread Thomas
Hi David 

Thank you for your reply. So, based on your suggestion, I should maintain a 
document (e.g. per user) with some aggregated values and update it as we move 
along with indexing our data, correct?

This, though, would only give me totals; I cannot apply something like a 
range. I also found a similar discussion here: 
https://groups.google.com/forum/#!msg/elasticsearch/UsrCG2Abj-A/IDO9DX_PoQwJ. 
Maybe something similar to the terms and histogram aggregations could support 
this logic, e.g. instead of giving:

{
"aggs" : {
"requests_distribution" : {
"distribution" : {
"field" : "user_id",
"interval" : 50
}
}
}
}

and the result could be:

{
"aggregations": {
"requests_distribution" : {
"buckets": [
{
"key": 0,
"doc_count": 2
},
{
"key": 50,
"doc_count": 400
},
{
"key": 150,
"doc_count": 30
}
    ]
    }
}
}

Where the key represents a bucket of unique users, e.g. 0 to 50 users have 2 
documents per user, etc.

Just an idea

Thanks
Thomas

On Tuesday, 24 June 2014 13:32:13 UTC+3, Thomas wrote:
>
> Hi,
>
> I wanted to ask whether it is possible to get with the aggregation 
> framework the distribution of one specific type of documents sent per user, 
> I'm interested for occurrences of documents per user, e.g. :
>
> 1000 users sent 1 document 
> 500 users sent 2 documents
> X number of unique users sent Y documents (each)
> etc.
>
> on each document i index the user_id
>
> Is there a way to support such a query, or partially support it? get the 
> first 10 rows of this type of list not the exhaustive list. Can you give me 
> some hint? 
>
> Thanks
>



Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread Thomas
Hi,

I wanted to ask whether it is possible to get with the aggregation 
framework the distribution of one specific type of documents sent per user, 
I'm interested for occurrences of documents per user, e.g. :

1000 users sent 1 document 
500 users sent 2 documents
X number of unique users sent Y documents (each)
etc.

On each document I index the user_id.

Is there a way to support such a query, or partially support it (get the 
first 10 rows of this type of list, not the exhaustive list)? Can you give me 
some hint?

Thanks



Re: Proper parsing of String values like 1m, 1q HOUR etc.

2014-06-24 Thread Thomas
Hi Brian,

Thanks for your reply. I understand your point, but if you check the source 
code of TimeValue it does not support the quarter or the year, so I was 
wondering which class, if any, supports the transformation of the string 1q 
or 1y into milliseconds.

Thanks
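For what it's worth, there may be no single millisecond value for 1q at all: quarters (and years) differ in length, so the date histogram rounds timestamps down to calendar-quarter boundaries (via Joda-Time) rather than subtracting a fixed duration. A stdlib sketch of equivalent rounding, assuming a GMT timezone and a hypothetical helper name:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.TimeZone;

// Rounds a yyyy-MM-dd day down to the first day of its calendar quarter
// (GMT), approximating what a date_histogram with interval "1q" does.
// Note there is no fixed "1q in milliseconds": the rounding is to calendar
// boundaries, not a constant duration.
public class QuarterRounding {
    public static String quarterStart(String isoDay) {
        try {
            SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd");
            f.setTimeZone(TimeZone.getTimeZone("GMT"));
            Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("GMT"));
            cal.setTime(f.parse(isoDay));
            // Jan/Apr/Jul/Oct are months 0, 3, 6, 9 in Calendar terms
            cal.set(Calendar.MONTH, (cal.get(Calendar.MONTH) / 3) * 3);
            cal.set(Calendar.DAY_OF_MONTH, 1);
            return f.format(cal.getTime());
        } catch (ParseException e) {
            throw new IllegalArgumentException(isoDay, e);
        }
    }
}
```

To match the histogram exactly one would still need to apply the same timezone (and any pre/post offsets) the aggregation uses.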

On Tuesday, 17 June 2014 18:31:37 UTC+3, Thomas wrote:
>
> Hi,
>
> I was wondering whether there is a proper Utility class to parse the given 
> values and get the duration in milliseconds probably for values such as 1m 
> (which means 1 minute) 1q (which means 1 quarter) etc.
>
> I have found that elasticsearch utilizes class TimeValue but it only 
> parses up to week, and values such as WEEK, HOUR are not accepted. So is in 
> elasticsearch source any utility class that does the job ? (for Histograms, 
> ranges wherever is needed)
>
> Thank you
> Thomas
>
>



Re: Splunk vs. Elastic search performance?

2014-06-19 Thread Thomas Paulsen
We had a 2.2 TB/day installation of Splunk and ran it on VMware with 12 
indexers and 2 search heads. Each indexer had 1000 IOPS guaranteed. The 
system is slow but OK to use. 

We tried Elasticsearch and were able to get the same performance with the 
same number of machines. Unfortunately, with Elasticsearch you need almost 
double the amount of storage, plus a LOT of patience to make it run. It took 
us six months to set it up properly, and even now the system is quite buggy 
and unstable, and from time to time we lose data with Elasticsearch. 

I don't recommend ELK for a critical production system; for dev work it is 
OK, if you don't mind the hassle of setting it up and operating it. The costs 
you save by not buying a Splunk license you have to invest in consultants to 
get it up and running. Our dev teams hate Elasticsearch and prefer Splunk.

Am Samstag, 19. April 2014 00:07:44 UTC+2 schrieb Mark Walkom:
>
> That's a lot of data! I don't know of any installations that big but 
> someone else might.
>
> What sort of infrastructure are you running splunk on now, what's your 
> current and expected retention?
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 19 April 2014 07:33, Frank Flynn > 
> wrote:
>
>> We have a large Splunk instance.  We load about 1.25 Tb of logs a day. 
>>  We have about 1,300 loaders (servers that collect and load logs - they may 
>> do other things too).
>>
>> As I look at Elasticsearch / Logstash / Kibana does anyone know of a 
>> performance comparison guide?  Should I expect to run on very similar 
>> hardware?  More? or Less?
>>
>> Sure it depends on exactly what we're doing, the exact queries and the 
>> frequency we'd run them but I'm trying to get any kind of idea before we 
>> start.
>>
>> Are there any white papers or other documents about switching?  It seems 
>> an obvious choice but I can only find very little performance comparisons 
>> (I did see that Elasticsearch just hired "the former VP of Products at 
>> Splunk, Gaurav Gupta" - but there were few numbers in that article either).
>>
>> Thanks,
>> Frank
>>
>>
>
>



Proper parsing of String values like 1m, 1q HOUR etc.

2014-06-17 Thread Thomas
Hi,

I was wondering whether there is a proper Utility class to parse the given 
values and get the duration in milliseconds probably for values such as 1m 
(which means 1 minute) 1q (which means 1 quarter) etc.

I have found that elasticsearch utilizes the class TimeValue, but it only 
parses up to weeks, and values such as WEEK and HOUR are not accepted. So is 
there any utility class in the elasticsearch source that does the job (for 
histograms, ranges, wherever it is needed)?

Thank you
Thomas



Re: Need help, multiple aggregations with filters extremely slow, where to look for optimizations?

2014-06-13 Thread Thomas
So I restructured my curl request as follows; is this what you mean? From 
some first tries I do get a slight improvement, but I need to check against 
production data:

Thank you, I will try it and come back with results.

curl -XPOST 
"http://10.129.2.42:9200/logs-idx.20140613/event/_search?search_type=count"; 
-d'
{
  "query": {
"filtered": {
  "filter": {
"or": [
  {
"and": [
  {
"has_parent": {
  "type": "request",
  "filter": {
"and": {
  "filters": [
{
  "term": {
"country": "US"
  }
},
{
  "term": {
"city": "NY"
  }
},
{
  "term": {
"code": 12
  }
}
  ]
}
  }
}
  },
  {
"range": {
  "event_time": {
"gte": "2014-06-13T10:00:00",
"lt": "2014-06-13T11:00:00"
  }
}
  }
]
  },
  {
"and": [
  {
"has_parent": {
  "type": "request",
  "filter": {
"and": {
  "filters": [
{
  "term": {
"country": "US"
  }
},
{
  "term": {
"city": "NY"
  }
},
{
  "term": {
"code": 12
  }
},
{
  "range": {
"request_time": {
  "gte": "2014-06-13T10:00:00",
  "lt": "2014-06-13T11:00:00"
}
  }
}
  ]
}
  }
}
  },
  {
"range": {
  "event_time": {
"lt": "2014-06-13T10:00:00"
  }
}
  }
]
  }
]
  }
}
  },
  "aggs": {
"per_interval": {
  "date_histogram": {
"field": "event_time",
"interval": "minute"
  },
  "aggs": {
"metrics": {
  "terms": {
"field": "event",
"size": 12
  }
}
  }
}
  }
}'


On Friday, 13 June 2014 10:09:46 UTC+3, Thomas wrote:
>
> Hi,
>
> I'm facing a performance issue with some aggregations I perform, and I 
> need your help if possible:
>
> I have two documents, the *request* and the *event*. The request is the 
> parent of the event. Below is a (sample) mapping
>
> "event" : {
> "dynamic" : "strict",
> "_parent" : {
>"type" : "request"
> },
> "properties" : {
>"event_time" : {
> "format" : "dateOptionalTime",
> "type" : "date"
>},
>"count" : {
>   "type" : "integer"
> },
> "event" : {
> "index" : "not_analyzed",
> "type" : "string"
> }
>  }
> }
>
> "request" : {
> "dynamic" : "strict",
>  "_id" : {
>"path" : "uniqueId"
>  },
>  "properties" : {
> "uniqueId" : {
>  "index" : "not_analyzed",
>  "type" : 

Re: Need help, multiple aggregations with filters extremely slow, where to look for optimizations?

2014-06-13 Thread Thomas
Below is an example aggregation I perform. Are there any optimizations I can 
make? Maybe disabling some features I do not need, etc.

curl -XPOST 
"http://localhost:9200/logs-idx.20140613/event/_search?search_type=count"; -d
'
{
  "aggs": {
"f1": {
  "filter": {
"or": [
  {
"and": [
  {
"has_parent": {
  "type": "request",
  "filter": {
"and": {
  "filters": [
{
  "term": {
"country": "US"
  }
},
{
  "term": {
"city": "NY"
  }
},
{
  "term": {
"code": 12
  }
}
  ]
}
  }
}
  },
  {
"range": {
  "event_time": {
"gte": "2014-06-13T10:00:00",
"lt": "2014-06-13T11:00:00"
  }
}
  }
]
  },
  {
"and": [
  {
"has_parent": {
  "type": "request",
  "filter": {
"and": {
  "filters": [
{
  "term": {
"country": "US"
  }
},
{
  "term": {
"city": "NY"
  }
},
{
  "term": {
"code": 12
  }
},
{
  "range": {
"request_time": {
  "gte": "2014-06-13T10:00:00",
  "lt": "2014-06-13T11:00:00"
}
  }
}
  ]
}
  }
}
  },
  {
"range": {
  "event_time": {
"lt": "2014-06-13T10:00:00"
  }
}
  }
]
  }
]
  },
  "aggs": {
"per_interval": {
  "date_histogram": {
"field": "event_time",
"interval": "minute"
  },
  "aggs": {
"metrics": {
  "terms": {
"field": "event",
"size": 10
  }
}
  }
}
  }
}
  }
}'


On Friday, 13 June 2014 10:09:46 UTC+3, Thomas wrote:
>
> Hi,
>
> I'm facing a performance issue with some aggregations I perform, and I 
> need your help if possible:
>
> I have two documents, the *request* and the *event*. The request is the 
> parent of the event. Below is a (sample) mapping
>
> "event" : {
> "dynamic" : "strict",
> "_parent" : {
>"type" : "request"
> },
> "properties" : {
>"event_time" : {
> "format" : "dateOptionalTime",
> "type" : "date"
>},
>"count" : {
>   "type" : "integer"
> },
> "event" : {
> "index" : "not_analyzed",
> "type" : "string"
> }
>  }
> }
>
> "request" : {
> "dynamic" : "strict",
>  "_id" : {
>"path" : "uniqueId"
>  },
>  "properties" : {
> "uniqueId" : {
>  "index" : "not_analyzed",
>  "type" : "string"
>

Need help, multiple aggregations with filters extremely slow, where to look for optimizations?

2014-06-13 Thread Thomas
Hi,

I'm facing a performance issue with some aggregations I perform, and I need 
your help if possible:

I have two documents, the *request* and the *event*. The request is the 
parent of the event. Below is a (sample) mapping

"event" : {
"dynamic" : "strict",
"_parent" : {
   "type" : "request"
},
"properties" : {
   "event_time" : {
"format" : "dateOptionalTime",
"type" : "date"
   },
   "count" : {
  "type" : "integer"
},
"event" : {
"index" : "not_analyzed",
"type" : "string"
}
 }
}

"request" : {
"dynamic" : "strict",
 "_id" : {
   "path" : "uniqueId"
 },
 "properties" : {
"uniqueId" : {
 "index" : "not_analyzed",
 "type" : "string"
},
"user" : {
 "index" : "not_analyzed",
 "type" : "string"
},
   "code" : {
  "type" : "integer"
   },
   "country" : {
 "index" : "not_analyzed",
 "type" : "string"
   },
   "city" : {
 "index" : "not_analyzed",
 "type" : "string"
   }
  
}
}

My cluster is becoming really big (almost 2 TB of data with billions of 
documents). I maintain one index per day and occasionally delete old indices. 
My daily index is about 20 GB. The version of elasticsearch that I use is 
1.1.1. 

My problems start when I want to get some aggregations of events with 
criteria applied to the parent request document, for example: count the 
events of type *click* for country = US and code = 12. What I was initially 
doing was to generate a scriptFilter for the request document (in Groovy) and 
add multiple aggregations in one search request. This ended up being very 
slow, so I removed the scripting logic and supported my logic with Java code.

What seemed to be solved on my local machine turned out otherwise: when I got 
back to the cluster, nothing had changed. Again my app performs really, 
really poorly; I get more than 10 seconds for a search with ~10 
sub-aggregations.

What seems strange is that the cluster looks pretty OK with regards to load 
average, CPU, etc. 

Any hints on where to look in order to identify the bottleneck and solve 
this?

*Ask for any additional information to provide*; I didn't want to make this 
post too long to read.
Thank you



Re: How to map a dynamic map of key values?

2014-06-11 Thread Thomas
one way would be to use a nested document structure like:
check: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-nested-type.html#mapping-nested-type
..
"properties" : {
   "type":"nested"
   "attributes":{
   "key" : {
   "index" : "not_analyzed",
   "type" : "string"
 },
  "value" : {
  "index":"not_analyzed"
  "type" : "string"
  }
}
}
...
You may even use multiple value fields, depending on the type of the value, 
if you know the type of each key; this lets you index each value properly and 
search on it:

"properties" : {
   "type":"nested"
   "attributes":{
   "key" : {
   "index" : "not_analyzed",
   "type" : "string"
 },
  "stringValue" : {
  "index":"not_analyzed"
  "type" : "string"
  },
  "numericValue" : {
  "type" : "float"
  }
}
}

How are you going to update the document? If you choose this way and you 
perform updates on a nested document (e.g. add a key/value pair), the whole 
document is reindexed.

Hope it helps

Thomas

On Wednesday, 11 June 2014 04:09:16 UTC+3, Drew wrote:
>
> Hey Guys, 
>
> How can I map an arbitrary map of key/values in ES? My JSON looks like the 
> following, where “name" and “age” are static but “attributes” is dynamic: 
>
> { 
>   “name”: “john”, 
>   “age”: 25, 
>   “attributes” : { 
> “key1”: value1, 
> “key2”: value2, 
> “key3”: value3, 
> ... 
>   } 
> } 
>
> Things to consider: 
> 1. Not all documents have the same number of attributes or even the same 
> attributes. 
> 2. Different documents can have values of different types for the same 
> name attributes (say “attr1” is string for doc 1 but is int for doc 2) 
> 3. Attribute value can only be primitive json type (boolean, integer, 
> number, null or string) or an array of primitive type. 
> 4. It goes without saying that the attributes must be searchable. 
>
> Thanks, 
>
> Drew



Re: Indexing nonstandard geo_point field.

2014-06-01 Thread Brian Thomas
I looked at the documentation for elasticsearch's geo_shape and it looks 
like it uses [longitude, latitude].

I found this note on the geo_shape documentation page 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-geo-shape-type.html

Note: In GeoJSON, and therefore Elasticsearch, the correct *coordinate order 
is longitude, latitude (X, Y)* within coordinate arrays. This differs from 
many Geospatial APIs (e.g., Google Maps) that generally use the colloquial 
latitude, longitude (Y, X).


An alternative I found was to use the computed fields plugin
https://github.com/SkillPages/elasticsearch-computed-fields

and create a mapping like this:

"@coordinates-str" : {
 "type" : "computed",
"script" : "_source.geo.coordinates[0] + ',' + 
_source.geo.coordinates[1]",
  "result" : {
  "type" : "geo_point",
  "store" : true
 }
 }

This seems to create the string in the correct format for the geo_point. The 
issue I am having with this method right now is that Elasticsearch returns an 
error if the source document does not have the geo.coordinates field.  
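Since the source arrays are [lat, lon] and the geo_point string form is "lat,lon", a small pre-index transform on the client side would also work, without any plugin: either join the values into the string form, or swap them into the standard [lon, lat] array order. A minimal sketch (the class and method names are illustrative, not an elasticsearch API):

```java
// Pre-index transform sketch: a GeoJSON-style coordinates array that is
// actually [lat, lon] can be emitted as the "lat,lon" string form that
// geo_point accepts, or swapped into the standard [lon, lat] array order.
public class GeoFix {
    public static String toLatLonString(double[] latLon) {
        return latLon[0] + "," + latLon[1];           // geo_point "lat,lon" string
    }

    public static double[] toLonLatArray(double[] latLon) {
        return new double[] { latLon[1], latLon[0] }; // standard [lon, lat] array
    }
}
```

The downside, as noted in the original question, is that it requires touching the document before indexing; the computed-fields approach keeps the source untouched at the cost of the missing-field error.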




On Sunday, June 1, 2014 4:28:24 PM UTC-4, Alexander Reelsen wrote:
>
> Hey,
>
> you could index this as a geo shape (as this is valid GeoJSON). If you 
> really need the functionality for a geo_point, you need to change the 
> structure of the data.
>
>
> --Alex
>
>
> On Sat, May 31, 2014 at 3:36 PM, Brian Thomas  > wrote:
>
>> I am new to Elasticsearch and I am trying to index a json document with a 
>> nonstandard lat/long format.
>>
>> I know the standard format for a geo_point array is [lon, lat], but the 
>> documents I am indexing has format [lat, lon].  
>>
>> This is what the JSON element looks like:
>>
>> "geo": {
>>   "type": "Point",
>>   "coordinates": [
>> 38.673459,
>> -77.336781
>>   ]
>> }
>>
>> Is there anyway I could have elasticsearch reorder this array or convert 
>> this field to a string without having to modify the source document prior 
>> to indexing? Could this be done using a field mapping or script in 
>> elasticsearch?
>>
>>
>>
>
>



Indexing nonstandard geo_point field.

2014-05-31 Thread Brian Thomas
I am new to Elasticsearch and I am trying to index a json document with a 
nonstandard lat/long format.

I know the standard format for a geo_point array is [lon, lat], but the 
documents I am indexing have the format [lat, lon].  

This is what the JSON element looks like:

"geo": {
  "type": "Point",
  "coordinates": [
38.673459,
-77.336781
  ]
}

Is there any way I could have elasticsearch reorder this array, or convert 
this field to a string, without having to modify the source document prior 
to indexing? Could this be done using a field mapping or a script in 
elasticsearch?




Question about week granularity elasticsearch uses

2014-05-23 Thread Thomas
Hello,

I ran into a situation where I need a timestamp field truncated to the week, 
and I want to do it exactly the way elasticsearch does it in the date 
histogram aggregation, in order to be able to perform comparisons. Does 
anyone know how I should perform the truncation to the week? I notice that 
the date histogram returns the beginning of the week (MONDAY); is it safe to 
use Calendar as follows?

Calendar cal = Calendar.getInstance();
> cal.setTimeZone(TimeZone.getTimeZone("GMT"));
> // cal.set(Calendar.DAY_OF_WEEK, cal.getFirstDayOfWeek());
> cal.set(Calendar.DAY_OF_WEEK, Calendar.MONDAY);
> Date time = cal.getTime();
> System.out.println("time = " + time);


Does the first day of the week depend on the Locale in Elasticsearch, or not?
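To my knowledge, date_histogram relies on Joda-Time, whose weeks are ISO weeks starting on Monday independent of the Locale — worth verifying against your version. For cross-checking the Calendar code above (which also needs its time-of-day fields zeroed), here is the same Monday-of-week truncation sketched in Python, in UTC:

```python
from datetime import datetime, timedelta

def truncate_to_week(ts):
    """Truncate a UTC timestamp to the start of its ISO week (Monday 00:00)."""
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return day - timedelta(days=day.weekday())  # weekday(): Monday == 0

print(truncate_to_week(datetime(2014, 5, 23, 15, 30)))  # 2014-05-19 00:00:00
```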

Thank you



Clear cache on demand and Circuit breaker Problem

2014-05-13 Thread Thomas
Hi,

I'm trying to get some aggregated information by querying Elasticsearch from 
my app. What I notice is that after some time I get a CircuitBreaker 
exception and my query fails. I assume that I load too much fielddata 
and eventually the CircuitBreaker stops my query. Inside my application I 
have logic that runs the query sequentially by time. For example, I split 
a query with a range of two hours into four queries of half an hour each, 
instead of doing one query for the full two-hour period. And this is 
something I can configure.

My question is whether it makes sense to perform a clearCache request 
between my requests (or every 15 minutes, for instance) in order to avoid 
the CircuitBreaker exception. I know it will make things slower, but to my 
mind it is better to perform a bit poorly than to stop the operation. Given 
that the query remains the same (with different parameters), does this make 
sense, or will I end up deleting and recreating the same cache again and 
again?

client.admin().indices().prepareClearCache(indexName).get();


Are there other alternatives to avoid the CircuitBreaker in the most 
efficient way? Of course, if I leave it unbounded I eventually get a heap 
space exception.
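For what it's worth, the range-splitting described above (one two-hour window into four half-hour queries) can be sketched as:

```python
from datetime import datetime, timedelta

def split_range(start, end, step):
    """Yield (from, to) sub-ranges covering [start, end) in fixed steps."""
    cur = start
    while cur < end:
        nxt = min(cur + step, end)
        yield cur, nxt
        cur = nxt

windows = list(split_range(datetime(2014, 3, 15, 0, 0),
                           datetime(2014, 3, 15, 2, 0),
                           timedelta(minutes=30)))
print(len(windows))  # 4
```

Since every sub-query touches the same fields, clearing the cache between them mostly forces the same fielddata to be reloaded; bounding the cache via `indices.fielddata.cache.size` is usually the more efficient control.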

Thank you



Scripts reload on demand

2014-05-08 Thread Thomas
Hi,

I was wondering whether there is a way to reload, on demand, the scripts 
provided under config/scripts. I'm facing a weird situation where, although 
the documentation describes that the scripts are reloaded every configurable 
amount of time, I do not see that happening, and there is no way to pick up 
a new script I add unless I restart my node(s). Is there a curl request to 
force a reload of the scripts? Additionally, is there any curl command that 
can display which scripts are loaded into an ES node and which are not?

I use elasticsearch 1.1.1 and my scripts are in Groovy (with groovy lang 
plugin installed)

Thank you



Re: access parent bucket's key from child aggregation in geohash grid

2014-05-02 Thread Thomas Gruner
Thank-you Adrien,

I managed to do a terms child aggregation on Locations and am averaging out 
the geohashes in the browser.

The js code is here:
Query:
https://github.com/GlobAllomeTree/GlobAllomeTree/blob/elasticutils/globallometree/apps/allometric_equations/static/allometric_equations/js/map.js#L289
Parsing geohashes
https://github.com/GlobAllomeTree/GlobAllomeTree/blob/elasticutils/globallometree/apps/allometric_equations/static/allometric_equations/js/map.js#L352

It seems to work very well and is still speedy.
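For readers landing here, the per-bucket averaging that the linked js does can be sketched as follows (a simple arithmetic mean, which is fine for points clustered inside one small geohash cell; no antimeridian handling):

```python
def average_point(points):
    """Arithmetic mean of a list of (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return sum(lats) / len(lats), sum(lons) / len(lons)

print(average_point([(38.0, -77.0), (39.0, -76.0)]))  # (38.5, -76.5)
```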

Kind regards,

Tom




On Thursday, May 1, 2014 10:59:59 PM UTC-7, Adrien Grand wrote:
>
> Hi,
>
> Unfortunately, accessing the parent bucket key is not possible.
>
>
> On Fri, May 2, 2014 at 12:04 AM, Thomas Gruner 
> 
> > wrote:
>
>> Hello!
>>
>> I have been progressing well with aggregations, but this one has got me 
>> stumped. 
>>
>> I'm trying to figure out how to access the key of the parent bucket from 
>> a child aggregation. 
>>
>> The parent bucket is geohash_grid, and the child aggregation is avg 
>> (trying to get avg lat and lon, but only for points that match the parent's 
>> bucket's geohash key)
>>
>> Something like this:
>> "aggregations" : { 
>> "LocationsGrid": {
>> "geohash_grid" : {
>> "field" : "Locations",
>> "precision" : 7, 
>> },
>> "aggregations" : {
>> "avg_lat": {
>> "avg": {
>> "script": "if 
>> (doc['Locations'].value.geohash.startsWith(*parent_bucket.key*)) 
>> doc['Locations'].value.lat;"
>> }
>> }
>> },
>> }
>> }
>>
>>
>> Thanks for any help or ideas with this!
>>
>>
>
>
>
> -- 
> Adrien Grand
>  



access parent bucket's key from child aggregation in geohash grid

2014-05-01 Thread Thomas Gruner
Hello!

I have been progressing well with aggregations, but this one has got me 
stumped. 

I'm trying to figure out how to access the key of the parent bucket from a 
child aggregation. 

The parent bucket is geohash_grid, and the child aggregation is avg (trying 
to get avg lat and lon, but only for points that match the parent's 
bucket's geohash key)

Something like this:
"aggregations" : { 
"LocationsGrid": {
"geohash_grid" : {
"field" : "Locations",
"precision" : 7, 
},
"aggregations" : {
"avg_lat": {
"avg": {
"script": "if 
(doc['Locations'].value.geohash.startsWith(*parent_bucket.key*)) 
doc['Locations'].value.lat;"
}
}
},
}
}


Thanks for any help or ideas with this!



Re: Upgrade cluster from 0.90.11 to 1.1.1

2014-04-27 Thread Thomas Ardal
Following the upgrade guide worked perfectly. Thank you very much for your 
help.

On Saturday, April 26, 2014 11:09:11 AM UTC+2, David Pilato wrote:
>
> Important part is:
>
> 0.90.x
>
> 1.x
>
> Restart Upgrade
> So you need to stop the whole cluster and restart.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> Le 26 avr. 2014 à 11:03, Srividhya Umashanker 
> > 
> a écrit :
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-upgrade.html
>
> >= 0.90.7
>
> 0.90.x
>
> Rolling Upgrade
>
> 1.x
>
> 1.x
>
> Rolling Upgrade
> I guess the upgrade will upgrade to latest lucene automatically.
> 1. stop node 1.
> 2. install 1.1.1 on node 1.
> *[NOT REQUIRED] **3. copy data folder to 1.1.1.*
> 4. start node 1 and wait for it to synchronize.
>
> Let me know how it goes.
>
> On Saturday, April 26, 2014 12:50:37 PM UTC+5:30, Thomas Ardal wrote:
>>
>> I'm running a two-node cluster with Elasticsearch 0.90.11. I want to 
>> upgrade to the newest version (1.1.1), but I'm not entirely sure on how to 
>> do it. 0.90.11 is based on Lucene 4.6.1 and 1.1.1 on Lucene 4.7.2. Can I do 
>> the following:
>>
>> 1. stop node 1.
>> 2. install 1.1.1 on node 1.
>> 3. copy data folder to 1.1.1.
>> 4. start node 1 and wait for it to synchronize.
>> 5. stop node 2.
>> 6. install 1.1.1 on node 2.
>> 7. copy data folder to 1.1.1.
>> 8. start node 2 and wait for it to synchronize.
>>
>> I can live with downtime if not possible otherwise.
>>
>



Upgrade cluster from 0.90.11 to 1.1.1

2014-04-26 Thread Thomas Ardal
I'm running a two-node cluster with Elasticsearch 0.90.11. I want to 
upgrade to the newest version (1.1.1), but I'm not entirely sure on how to 
do it. 0.90.11 is based on Lucene 4.6.1 and 1.1.1 on Lucene 4.7.2. Can I do 
the following:

1. stop node 1.
2. install 1.1.1 on node 1.
3. copy data folder to 1.1.1.
4. start node 1 and wait for it to synchronize.
5. stop node 2.
6. install 1.1.1 on node 2.
7. copy data folder to 1.1.1.
8. start node 2 and wait for it to synchronize.

I can live with downtime if not possible otherwise.



Re: Parent/Child combination Script possible?

2014-04-13 Thread Thomas
Thanks Sven,

Yes this would solve a lot of use cases..

Can anyone say whether we should open an issue for this? The link provided 
does not make clear whether an issue should finally be opened.

Thanks 
Thomas

On Friday, 11 April 2014 18:53:08 UTC+3, Thomas wrote:
>
> Hello,
>
>
> I have two document types which are utilizing a parent/child relation. I 
> want to perform an aggregation where the script utilizes fields from both 
> documents. Is that possible?
>
> More specifically:
>
> Parent Document 
>
> {
>>   "tag:{
>>"_id": {
>>  "path": "tag_id"
>>},
>>"properties": {
>>  "tag_id": {"index": "not_analyzed","type": "string"},
>>  "name": {"index": "not_analyzed","type": "string"}
>>  "tag_counter": {"type": "integer"}
>>}
>>   }
>> }
>
>
>
> Child Document
>
> {
>>   "click:{
>>"_parent": {
>>  "type": "tag"
>>},
>>"properties": {
>>  "type": {"index": "not_analyzed","type": "string"},
>>  "clicks_counter": {"type": "integer"}
>>}
>>   }
>> }
>
>
>
> curl -XGET "http://localhost:9200/tags-index/tags,clicks/_search"; -d'
>> {
>>"aggregations": {
>>   "one_day_filter": {
>>  "filter": {
>> "range": {
>>"ts": {
>>   "gte": "2014-03-15T00:00:00",
>>   "lt": "2014-03-15T01:00:00"
>>}
>> }
>>  },
>>  "aggregations": {
>>     "parent": {
>>"filter": {
>>   "has_child": {
>>  "type": "clicks",
>>  "query": {
>> "match_all": {}
>>  }
>>   }
>>},
>>"aggregations": {
>>   "metrics": {
>>  "terms": {
>> "script": "*doc[\"tags.tag_counter\"].value - 
>> "doc[\"clicks.clicks_counter\"].value*"
>> }
>>  }
>>   }
>>}
>> }
>>  }
>>   }
>>},
>>"size": 0
>> }'
>
>
>
> Thanks
> Thomas
>



Parent/Child combination Script possible?

2014-04-11 Thread Thomas
Hello,


I have two document types which are utilizing a parent/child relation. I 
want to perform an aggregation where the script utilizes fields from both 
documents. Is that possible?

More specifically:

Parent Document 

{
>   "tag": {
>     "_id": {
>       "path": "tag_id"
>     },
>     "properties": {
>       "tag_id": { "index": "not_analyzed", "type": "string" },
>       "name": { "index": "not_analyzed", "type": "string" },
>       "tag_counter": { "type": "integer" }
>     }
>   }
> }



Child Document

{
>   "click": {
>     "_parent": {
>       "type": "tag"
>     },
>     "properties": {
>       "type": { "index": "not_analyzed", "type": "string" },
>       "clicks_counter": { "type": "integer" }
>     }
>   }
> }



curl -XGET "http://localhost:9200/tags-index/tags,clicks/_search" -d'
> {
>   "aggregations": {
>     "one_day_filter": {
>       "filter": {
>         "range": {
>           "ts": {
>             "gte": "2014-03-15T00:00:00",
>             "lt": "2014-03-15T01:00:00"
>           }
>         }
>       },
>       "aggregations": {
>         "parent": {
>           "filter": {
>             "has_child": {
>               "type": "clicks",
>               "query": { "match_all": {} }
>             }
>           },
>           "aggregations": {
>             "metrics": {
>               "terms": {
>                 "script": "doc['tags.tag_counter'].value - doc['clicks.clicks_counter'].value"
>               }
>             }
>           }
>         }
>       }
>     }
>   },
>   "size": 0
> }'



Thanks
Thomas



Terms aggregation scripts running slower than expected

2014-04-09 Thread Thomas S.
Hi,

I am currently exploring the option of using scripts with aggregations and 
I noticed that for some reason scripts for terms aggregations are executed 
much slower than for other aggregations, even if the script doesn't access 
any fields yet. This also happens for native Java scripts. I'm running 
Elasticsearch 1.1.0.

For example, on my data set the simple script "1" takes around 400ms for 
the sum and histogram aggregations, but around 25s on a terms aggregation, 
even on repeated runs. What is going on here? Terms aggregations without a 
script are very fast, and histogram/sum aggregations with scripts that 
access the document are also very fast. I had to transform a script 
aggregation that should have been a terms aggregation into a histogram, 
converting the numeric values back into terms on the client, just so the 
aggregation would execute in reasonable time.


In [2]: app.search.search({'size': 0, 'query': { 'match_all': {} }, 
'aggregations': { 'test_script': { 'terms': { 'script': '1' } } }})
Out[2]:
{u'_shards': {u'failed': 0, u'successful': 246, u'total': 246},
 u'aggregations': {u'test_script': {u'buckets': [{u'doc_count': 4231327,
 u'key': u'1'}]}},
 u'hits': {u'hits': [], u'max_score': 0.0, u'total': 4231327},
 u'timed_out': False,
 u'took': 24986}


In [10]: app.search.search({'size': 0, 'query': { 'match_all': {} }, 
'aggregations': { 'test_script': { 'sum': { 'script': '1' } } }})
Out[10]:
{u'_shards': {u'failed': 0, u'successful': 246, u'total': 246},
 u'aggregations': {u'test_script': {u'value': 4231327.0}},
 u'hits': {u'hits': [], u'max_score': 0.0, u'total': 4231327},
 u'timed_out': False,
 u'took': 363}


In [8]: app.search.search({'size': 0, 'query': { 'match_all': {} }, 
'aggregations': { 'test_script': { 'histogram': { 'script': '1', 
'interval': 1 } } }})
Out[8]:
{u'_shards': {u'failed': 0, u'successful': 246, u'total': 246},
 u'aggregations': {u'test_script': {u'buckets': [{u'doc_count': 4231327,
 u'key': 1}]}},
 u'hits': {u'hits': [], u'max_score': 0.0, u'total': 4231327},
 u'timed_out': False,
 u'took': 421}


Thomas



Re: query field computed by child query?

2014-03-31 Thread Thomas Andres

Thanks for the examples. Looks quite interesting. If I understand that 
correctly, I'd have to write a plugin doing my subquery. Too bad I don't 
have much time right now :( Sounds like an interesting challenge :)



Re: query field computed by child query?

2014-03-28 Thread Thomas Andres
I want to return "all" parents (or those matching some other query 
conditions), but in addition to the other data in the document, I want to 
compute for each parent whether it has any child with the error flag set. I 
don't want to filter on this condition in this case.

Am Freitag, 28. März 2014 14:21:30 UTC+1 schrieb Binh Ly:
>
> Not sure I understand. So if you run a _search on the parent, and use the 
> has_child filter to return only parents that match some child condition, is 
> that not what you want? 
>



query field computed by child query?

2014-03-28 Thread Thomas Andres
I have documents in a parent/child relation. In a query run on the parent, 
I'd like to know whether the found parents have children matching some 
query. I don't want to filter to only parents matching some condition on the 
child, but only to get the information that they have children matching some 
query.

Any idea if that's possible? I've been thinking of maybe adding a 
script_field that would compute that, but have no idea how to run child 
queries from a script field.

An example to clarify my problem:
child has a boolean field "error"

I run a query on the parent and want to show an information if any of the 
children has the error flag set.

Any hint would be welcome.
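One technique that might fit (an assumption to verify on your version, not something confirmed in this thread): give a has_child clause a `_name` inside a bool `should`, so all parents still match the main query, but hits whose children match report the name in each hit's `matched_queries`. A sketch of such a request body, built as a Python dict — the type and field names are placeholders:

```python
import json

body = {
    "query": {
        "bool": {
            "must": [{"match_all": {}}],       # your real parent query goes here
            "should": [{
                "has_child": {
                    "type": "child_type",       # placeholder child type
                    "query": {"term": {"error": True}},
                    "_name": "has_error_child"  # surfaces in matched_queries
                }
            }]
        }
    }
}
print(json.dumps(body, indent=2))
```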




Re: Inconsistent search cluster status and search results after long GC run

2014-03-27 Thread Thomas S.
Forgot to reply to your questions, Binh:

1) No I haven't set this. However I wonder if this has any significant 
effect since swap space is barely used.
2) It seems to happen when the cluster is under high load but I haven't 
seen any specific pattern so far.
3) No there's not. There's a very small Redis instance running on node1, 
but there's nothing else on the nodes with shards (where the GC problem 
happens).

If I was going to disable master on any node that has shards I'd have to 
add another dummy node with master:true so the cluster is in good state if 
any one of the nodes is down.


On Thursday, March 27, 2014 4:46:41 PM UTC+1, Binh Ly wrote:
>
> I would probably not master enable any node that can potentially gc for a 
> couple seconds. You want your master-eligible nodes to make decisions as 
> quick as possible.
>
> About your GC situation, I'd find out what the underlying cause is:
>
> 1) Do you have bootstrap.mlockall set to true?
>
> 2) Does it usually triggered while running queries? Or is there a pattern 
> on when it usually triggers?
>
> 3) Is there anything else running on these nodes that would overload and 
> affect normal ES operations?
>



Re: Inconsistent search cluster status and search results after long GC run

2014-03-27 Thread Thomas S.
Thanks Jörg,

I can increase the ping_timeout to 60s for now. However, shouldn't the goal 
be to minimize the time GC runs? Is the node blocked while GC runs, delaying 
any requests to it? If so, it would be very bad to allow long GC runs.

Regarding the bulk thread pool: I specifically set this to a higher value 
to avoid errors when we perform bulk indexing (we sometimes had errors when 
the queue was full and set to 50; I was also going to increase the "index" 
queue since there are sometimes errors). I will try keeping the limit and 
giving indexing more heap space instead, as you suggested.

Regarding Java 8: We're currently running Java 7 and haven't tweaked any GC 
specific settings. Do you think it makes sense to already switch to Java 8 
on production and enable the G1 garbage collector?

Thanks again,
Thomas

On Thursday, March 27, 2014 9:41:10 PM UTC+1, Jörg Prante wrote:
>
> It seems you run into trouble because you changed some of the default 
> settings, worsening your situation.
>
> Increase ping_timout from 9s to 60s as first band aid - you have GCs with 
> 35secs running.
>
> You should reduce the bulk thread pool of 100 to 50, this reduces high 
> memory pressure on the 20% memory you allow. Give more heap space to 
> indexing, use 50% instead of 20%.
>
> Better help would be to diagnose the nodes if you exceed the capacity for 
> search and index operations. If so, think about adding nodes.
>
> More finetuning after adding nodes could include G1 GC with Java 8, which 
> is targeted to minimize GC stalls. This would not solve node capacity 
> problems though.
>
> Jörg
>
>
> On Thu, Mar 27, 2014 at 4:46 PM, Binh Ly  >wrote:
>
>> I would probably not master enable any node that can potentially gc for a 
>> couple seconds. You want your master-eligible nodes to make decisions as 
>> quick as possible.
>>
>> About your GC situation, I'd find out what the underlying cause is:
>>
>> 1) Do you have bootstrap.mlockall set to true?
>>
>> 2) Does it usually triggered while running queries? Or is there a pattern 
>> on when it usually triggers?
>>
>> 3) Is there anything else running on these nodes that would overload and 
>> affect normal ES operations?
>>  
>>
>
>



Re: Split brain problem on Azure

2014-03-27 Thread Thomas Ardal
Ok. Also using the zen.* keys?



Split brain problem on Azure

2014-03-27 Thread Thomas Ardal
I'm experiencing a split-brain problem on my Elasticsearch cluster on Azure, 
which consists of two nodes. I've read about the zen.ping.timeout 
and discovery.zen.minimum_master_nodes settings, but I guess I can't 
use those settings when using the Azure plugin. Any ideas for avoiding 
split brain with the Azure plugin?
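For reference, the usual guard is `discovery.zen.minimum_master_nodes` set to a quorum of the master-eligible nodes. With only two nodes the quorum is 2, which prevents split brain but means no master can be elected while either node is down — which is why three master-eligible nodes are commonly recommended. The arithmetic, as a sketch:

```python
def quorum(master_eligible):
    """Quorum value for discovery.zen.minimum_master_nodes."""
    return master_eligible // 2 + 1

print([quorum(n) for n in (2, 3, 5)])  # [2, 2, 3]
```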



Inconsistent search cluster status and search results after long GC run

2014-03-27 Thread Thomas S.
)
at 
org.elasticsearch.discovery.zen.ZenDiscovery$7.execute(ZenDiscovery.java:556)
at 
org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:308)
at 
org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:134)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: 
[node2][inet[/10.216.32.81:9300]] Node not connected
at 
org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:859)
at 
org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:540)
at 
org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 7 more


NODE 2
[2014-03-27 07:19:02,871][INFO ][cluster.service  ] [node2] removed 
{[node3][RRqWlTWnQ7ygvsOaJS0_mA][node3][inet[/10.235.38.84:9300]]{master=true},},
 
reason: 
zen-disco-node_failed([node3][RRqWlTWnQ7ygvsOaJS0_mA][node3][inet[/10.235.38.84:9
300]]{master=true}), reason failed to ping, tried [2] times, each with 
maximum [9s] timeout


NODE 3
[2014-03-27 07:19:20,055][WARN ][monitor.jvm  ] [node3] 
[gc][old][539697][754] duration [35.1s], collections [1]/[35.8s], total 
[35.1s]/[2.7m], memory [4.9gb]->[4.2gb]/[7.9gb], all_pools {[young] 
[237.8mb]->[7.4mb]/[266.2mb]}{[survivor] [25.5mb]->[0b]/[33
.2mb]}{[old] [4.6gb]->[4.2gb]/[7.6gb]}
[2014-03-27 07:19:20,112][INFO ][discovery.zen] [node3] 
master_left 
[[node2][A45sMYqtQsGrwY5exK0sEg][node2][inet[/10.216.32.81:9300]]{master=true}],
 
reason [do not exists on master, act as master failure]
[2014-03-27 07:19:20,117][INFO ][cluster.service  ] [node3] master 
{new 
[node1][DxlcpaqOTmmpNSRoqt1sZg][node1.example][inet[/10.252.78.88:9300]]{data=false,
 
master=true}, previous 
[node2][A45sMYqtQsGrwY5exK0sEg][node2][inet[/10.216.32.81:9300
]]{master=true}}, removed 
{[node2][A45sMYqtQsGrwY5exK0sEg][node2][inet[/10.216.32.81:9300]]{master=true},},
 
reason: zen-disco-master_failed 
([node2][A45sMYqtQsGrwY5exK0sEg][node2][inet[/10.216.32.81:9300]]{master=true})


After this scenario, the cluster doesn't recover properly: The worst thing 
is that node 1 sees nodes 1+3, node 2 sees nodes 1+2 and node 3 sees nodes 
1+3. Since the cluster is set up to operate with two nodes, both data nodes 
2 and 3 accept data and searches, causing inconsistent results and 
requiring us to do a full cluster restart and reindex all production data 
to make sure the cluster is consistent again.


NODE 1 (GET /_nodes):
{
  "cluster_name" : "elasticsearch",
  "nodes" : {
"DxlcpaqOTmmpNSRoqt1sZg" : {
  "name" : "node1",
  ...
},
"RRqWlTWnQ7ygvsOaJS0_mA" : {
  "name" : "node3",
  ...
}
  }
}

NODE 2 (GET /_nodes):
{
  "cluster_name" : "elasticsearch",
  "nodes" : {
"A45sMYqtQsGrwY5exK0sEg" : {
  "name" : "node2",
  ...
},
"DxlcpaqOTmmpNSRoqt1sZg" : {
  "name" : "node1",
  ...
}
  }
}

NODE 3 (GET /_nodes):
{
  "cluster_name" : "elasticsearch",
  "nodes" : {
"DxlcpaqOTmmpNSRoqt1sZg" : {
  "name" : "node1",
  ...
},
"RRqWlTWnQ7ygvsOaJS0_mA" : {
  "name" : "node3",
  ...
}
  }
}


Here are the configurations:

BASE CONFIG (for all nodes):
action:
  disable_delete_all_indices: true
discovery:
  zen:
fd:
  ping_retries: 2
  ping_timeout: 9s
minimum_master_nodes: 2
ping:
  multicast:
enabled: false
  unicast:
hosts: ["node1.example", "node2.example", "node3.example"]
index:
  fielddata:
cache: node
indices:
  fielddata:
cache:
  size: 40%
  memory:
index_buffer_size: 20%
threadpool:
  bulk:
queue_size: 100
type: fixed
transport:
  tcp:
connect_timeout: 3s

NODE 1:
node:
  data: false
  master: true
  name: node1

NODE 2:
node:
  data: true
  master: true
  name: node2

NODE 3:
node:
  data: true
  master: true
  name: node3


Questions:
1) What can we do to minimize long GC runs, so the nodes don't become 
unresponsive and disconnect in the first place? (FYI: Our index is 
currently about 80 GB in size with over 2M docs (per node), 60 shards, heap 
size 8 GB. We run both searches and aggregations on it.)
2) Obviously, having the cluster in a state like the above is 
unacceptable, and we therefore want to make sure that even if a node 
disconnects be

Re: Delete by query fails often with HTTP 503

2014-03-18 Thread Thomas S.
Thanks Clint,

We have two nodes with 60 shards per node. I will increase the queue size. 
Hopefully this will reduce the amount of rejections.

Thomas
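
For reference, the queue-size change can be sketched in `elasticsearch.yml`
(setting names as documented for the 1.x thread pool module; verify against
the documentation for your version):

```yaml
# elasticsearch.yml -- enlarge the `index` thread pool queue so
# delete-by-query bursts are queued rather than rejected.
# A larger queue holds more pending requests and so uses more heap.
threadpool:
  index:
    type: fixed
    queue_size: 1000
```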


On Tuesday, March 18, 2014 6:11:27 PM UTC+1, Clinton Gormley wrote:
>
> Do you have lots of shards on just a few nodes? Delete by query is handled 
> by the `index` thread pool, but those threads are shared across all shards 
> on a node.  Delete by query can produce a large number of changes, which 
> can fill up the thread pool queue and result in rejections.
>
> You can either just (a) retry or (b) increase the queue size for the 
> `index` thread pool (which will use more memory as more delete requests 
> will need to be queued)
>
> See 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html#types
>
> clint
>
>
> On 18 March 2014 08:13, Thomas S. > wrote:
>
>> Hi,
>>
>> We often get failures when using the delete by query API. The response is 
>> an HTTP 503 with a body like this:
>>
>> {"_indices": {"myindex": {"_shards": {"successful": 2, "failed": 58, 
>> "total": 60
>>
>> Is there a way to figure out what is causing this error? It seems to 
>> mostly happen when the search cluster is busy.
>>
>> Thomas
>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b815184a-8382-4b25-8a54-b98753f6cbb4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Delete by query fails often with HTTP 503

2014-03-18 Thread Thomas S.
Hi,

We often get failures when using the delete by query API. The response is 
an HTTP 503 with a body like this:

{"_indices": {"myindex": {"_shards": {"successful": 2, "failed": 58, 
"total": 60

Is there a way to figure out what is causing this error? It seems to mostly 
happen when the search cluster is busy.

Thomas
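
As noted in the reply thread, one mitigation is simply retrying rejected
requests. A generic client-side backoff sketch (illustrative only — the
`RejectedError` exception here is a stand-in for a 503 response, not an
Elasticsearch client class):

```python
import time

class RejectedError(Exception):
    """Stand-in for an HTTP 503 / EsRejectedExecutionException response."""

def retry_with_backoff(op, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Run `op`, retrying on RejectedError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return op()
        except RejectedError:
            if attempt == max_attempts - 1:
                raise                      # give up after the last attempt
            sleep(base_delay * 2 ** attempt)

# Example: an operation that is rejected twice before succeeding.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RejectedError()
    return "ok"

print(retry_with_backoff(flaky, sleep=lambda s: None))
```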



Re: TransportClient timeout / webserver configuration - JAVA Api

2014-03-10 Thread Thomas
As per the documentation, the Client is thread-safe, and the elasticsearch 
team recommends reusing the same client instance across your app. Given 
the exception above, you might want to check your configuration first 
(cluster name and host/port); note that the Java API uses port 9300, not 
the HTTP port 9200. Finally, check whether a firewall or something similar 
is blocking port 9300.

Hope it helps
Thomas
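
A quick way to verify the connectivity part of the advice above (a plain
Python sketch; "my-es-host" is a placeholder, not a real hostname):

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds -- a quick
    check that the transport port (9300) is reachable and not firewalled."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# The Java TransportClient talks to 9300 (transport), not 9200 (HTTP):
# can_connect("my-es-host", 9300)   # hypothetical host
```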

On Monday, 10 March 2014 15:42:25 UTC+2, Robert Langenfeld wrote:
>
> Hello,
>
> I'm developing a tomcat webserver application that uses ElasticSearch 1.0 
> (Java API). There is a client facing desktop application that communicates 
> with the server so all the code for ElasticSearch is on that one instance 
> and it is used by all our clients. With that being said I am running into 
> this issue: After initializing a new TransportClient object and performing 
> some operation on it, there is a chance that it could sit idle for a very 
> long time. When it does sit idle for a long time, it gets this error:
>
>
> Mar 08, 2014 1:15:37 AM org.elasticsearch.client.transport
>
> INFO: [Elven] failed to get node info for 
> [#transport#-1][WIN7-113-00726][inet[/159.140.213.87:9300]], 
> disconnecting...
>
> org.elasticsearch.transport.RemoteTransportException: 
> [Server_Dev1][inet[/159.140.213.87:9300]][cluster/nodes/info]
>
> Caused by: java.lang.NullPointerException
>
> at org.elasticsearch.http.HttpInfo.writeTo(HttpInfo.java:82)
>
> at 
> org.elasticsearch.action.admin.cluster.node.info.NodeInfo.writeTo(NodeInfo.java:301)
>
> at 
> org.elasticsearch.action.admin.cluster.node.info.NodesInfoResponse.writeTo(NodesInfoResponse.java:63)
>
> at 
> org.elasticsearch.transport.netty.NettyTransportChannel.sendResponse(NettyTransportChannel.java:83)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$TransportHandler$1.onResponse(TransportNodesOperationAction.java:244)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$TransportHandler$1.onResponse(TransportNodesOperationAction.java:239)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.finishHim(TransportNodesOperationAction.java:225)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.onOperation(TransportNodesOperationAction.java:200)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.access$900(TransportNodesOperationAction.java:102)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction$2.run(TransportNodesOperationAction.java:146)
>
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
> at java.lang.Thread.run(Thread.java:744)
>
> Is there any way to prevent this from happening? I know the ideal 
> situation would be that after every request the transport client is closed. 
> But since it lives on a webserver with lots of search requests coming in, 
> we would ideally like it to stay open because it takes 3-4 seconds for a 
> transport client to initialize and we are going for speed here.
>
> Also since we are having one central server to handle all search and index 
> requests, can the TransportClient handle multiple simultaneous requests 
> from different users at the same time? We just want to make sure that we 
> are doing this correctly.
>
>



Unable to load script under config/scripts

2014-03-10 Thread Thomas
Hi,

I'm trying to keep some scripts under config/scripts, but Elasticsearch 
does not seem to be able to locate them. What could be a possible reason 
for this?

When I try to invoke one, ES fails with the following:

No such property:  for class: Script1

Any ideas?

Thanks



Re: Queue capacity and EsRejectedExecutionException leads to loss of data

2014-02-26 Thread Thomas
Thanks David,

So this is a RabbitMQ river issue, then. Is there a need to open a 
separate issue for it? (I've never gone through that procedure; I'll look 
into it.)

Thomas

On Wednesday, 26 February 2014 15:48:55 UTC+2, Thomas wrote:
>
> Hi,
>
> We have installed the RabbitMQ river plugin to pull data from our Queue 
> and adding them to ES. The thing is that at some point we are receiving the 
> following exception and we have as a result to *lose data*.
>
> [1775]: index [events-idx], type [click], id 
>> [3f6e4604146b435aabcf4ea5a493fd32], message 
>> [EsRejectedExecutionException[rejected execution (queue capacity 50) on 
>> org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@12843ca2]]
>
>
> We have changed the queue size configuration to 1000 and the problem 
> disappeared.
>
> My question is: is there any configuration or way to tell ES, instead of 
> throwing this exception and discarding the document, to wait for 
> available resources (with the corresponding performance impact)?
>
> Thanks
>
> Thomas
>
>
>



Queue capacity and EsRejectedExecutionException leads to loss of data

2014-02-26 Thread Thomas
Hi,

We have installed the RabbitMQ river plugin to pull data from our queue 
and add it to ES. The problem is that at some point we receive the 
following exception and, as a result, *lose data*.

[1775]: index [events-idx], type [click], id 
> [3f6e4604146b435aabcf4ea5a493fd32], message 
> [EsRejectedExecutionException[rejected execution (queue capacity 50) on 
> org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@12843ca2]]


We have changed the queue size configuration to 1000 and the problem 
disappeared.

My question is: is there any configuration or way to tell ES, instead of 
throwing this exception and discarding the document, to wait for available 
resources (with the corresponding performance impact)?

Thanks

Thomas
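
One client-side way to approximate "wait instead of reject" is to bound
the number of in-flight bulk requests on the producer side, so the river
or feeder blocks when the budget is spent. A sketch (plain Python, for
illustration; not Elasticsearch or river code):

```python
import threading

class BoundedSubmitter:
    """Blocks the caller when `max_in_flight` operations are outstanding,
    providing backpressure instead of letting the server reject work."""
    def __init__(self, max_in_flight: int):
        self._slots = threading.Semaphore(max_in_flight)

    def submit(self, send):
        self._slots.acquire()   # blocks once the in-flight budget is spent
        try:
            return send()       # e.g. a synchronous bulk-index call
        finally:
            self._slots.release()

s = BoundedSubmitter(max_in_flight=2)
print(s.submit(lambda: "indexed"))
```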




Re: Marvel housekeeping

2014-02-24 Thread Thomas Andres
I know deleting them isn't hard, and it's good to know there is a tool to 
automate that. However, I do think that Marvel should do this itself. It 
shouldn't be too hard to, e.g., extend the code that creates a new daily 
index to also clean up old ones.

As it is now, you install a plugin and suddenly run out of space, which I 
don't consider good default behaviour (I know you guys take care to set 
smart default values, which is one reason elasticsearch is so good!). I 
think this would be a small extension that would probably spare many users 
a rather bad surprise.

Cheers
Thomas


Am Freitag, 14. Februar 2014 20:40:26 UTC+1 schrieb Boaz Leskes:
>
>
> Marvel itself doesn't have a setting for this, but you can have a look at 
> this tool, built by the logstash team to help management indices with time 
> based data: https://github.com/elasticsearch/curator
>
>
>



Best way to extract a CSV with aggregations

2014-02-18 Thread Thomas
Hi,

I was wondering what is the best way to extract a CSV with the various 
terms of some fields alongside the corresponding counts? For example, 
create a CSV as follows:

productId, product, category, city, hour, products_count, selled_products_count

Would the cat API be useful in this case? Or do I have to make multiple 
aggregation calls to construct this information?

Thanks
Thomas
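
One common approach is to run the aggregation(s) and flatten the buckets to
CSV client-side. A sketch for a single-level terms aggregation (the response
shape matches the aggregations section of a search response; field and
aggregation names here are made up for illustration):

```python
import csv
import io

def terms_agg_to_csv(agg_response, agg_name):
    """Write one CSV row per terms-aggregation bucket: (key, doc_count).
    `agg_response` is the parsed JSON body of a search with size=0."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([agg_name, "count"])
    for bucket in agg_response["aggregations"][agg_name]["buckets"]:
        writer.writerow([bucket["key"], bucket["doc_count"]])
    return out.getvalue()

sample = {"aggregations": {"category": {"buckets": [
    {"key": "books", "doc_count": 12},
    {"key": "games", "doc_count": 7},
]}}}
print(terms_agg_to_csv(sample, "category"))
```

Multi-column output like the example above would need either nested
sub-aggregations walked recursively, or one call per grouping.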



Re: Avoiding duplicate documents with versioning

2014-02-18 Thread Thomas
Just for any other people who might find this post useful: we finally 
managed to get the expected functionality, as described here.

Thanks
Thomas

On Saturday, 15 February 2014 16:53:20 UTC+2, Thomas wrote:
>
> Hi,
>
> First of all congrats for the 1.0 release!! Thumbs up for the aggregation 
> framework :)
>
> I'm trying to build a system which is kind of querying for analytics. I 
> have a document called *event*, and I have events of specific type (e.g. 
> click open etc.) per page. So per page i might have for example an *open 
> event*. The thing is that I might as well take the open event *more than 
> once*, but I want to count it only once. So I use the versioning API and 
> I provide the same document id having as a result the version to increase. 
>
> In my queries I use the _timestamp field to determine the last document 
> that I counted. But my problem is that since ES reindex the document, it 
> updates _timestamp so it seems as recent document, and in my queries I 
> count it again.
>
> Is there a way to simply *discard* the document if the document with the 
> same id exists, without stopping the bulk operation of uploading documents?
>
> Thanks 
> Thomas
>



Re: Avoiding duplicate documents with versioning

2014-02-15 Thread Thomas
Just an update,

If we use op_type=create in the index request, it will probably discard 
the duplicate document. But in the case where we do a bulk operation, will 
it stop the bulk upload, or will it report the error and move on to the 
next document?

thanks

On Saturday, 15 February 2014 16:53:20 UTC+2, Thomas wrote:
>
> Hi,
>
> First of all congrats for the 1.0 release!! Thumbs up for the aggregation 
> framework :)
>
> I'm trying to build a system which is kind of querying for analytics. I 
> have a document called *event*, and I have events of specific type (e.g. 
> click open etc.) per page. So per page i might have for example an *open 
> event*. The thing is that I might as well take the open event *more than 
> once*, but I want to count it only once. So I use the versioning API and 
> I provide the same document id having as a result the version to increase. 
>
> In my queries I use the _timestamp field to determine the last document 
> that I counted. But my problem is that since ES reindex the document, it 
> updates _timestamp so it seems as recent document, and in my queries I 
> count it again.
>
> Is there a way to simply *discard* the document if the document with the 
> same id exists, without stopping the bulk operation of uploading documents?
>
> Thanks 
> Thomas
>



Avoiding duplicate documents with versioning

2014-02-15 Thread Thomas
Hi,

First of all congrats for the 1.0 release!! Thumbs up for the aggregation 
framework :)

I'm trying to build a system which is a kind of analytics querying. I have 
a document called *event*, and I have events of a specific type (e.g. 
click, open, etc.) per page. So per page I might have, for example, an 
*open event*. The thing is that I might receive the open event *more than 
once*, but I want to count it only once. So I use the versioning API and 
provide the same document id, which causes the version to increase.

In my queries I use the _timestamp field to determine the last document 
that I counted. But my problem is that since ES reindexes the document, it 
updates _timestamp, so it appears to be a recent document, and in my 
queries I count it again.

Is there a way to simply *discard* the document if a document with the 
same id exists, without stopping the bulk operation of uploading documents?

Thanks 
Thomas
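
The "discard duplicates" behaviour asked for here corresponds to first-write-
wins semantics (what op_type=create gives: a duplicate id is rejected rather
than re-versioned, so the original _timestamp survives). A sketch of that
counting model in plain Python — not ES code, just the semantics:

```python
def ingest(events, store):
    """First-write-wins ingestion: mimics indexing with op_type=create,
    where a duplicate id is rejected instead of bumping the version
    (so the original document's _timestamp is preserved)."""
    created = conflicts = 0
    for doc_id, doc in events:
        if doc_id in store:
            conflicts += 1          # duplicate event: discard it
        else:
            store[doc_id] = doc
            created += 1
    return created, conflicts

store = {}
print(ingest([("open-1", {"t": 1}),
              ("open-1", {"t": 2}),      # duplicate: discarded
              ("click-1", {"t": 3})], store))
```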



Marvel housekeeping

2014-02-14 Thread Thomas Andres
I upgraded elasticsearch to 0.90.11 and installed marvel. Congratulations 
on a really nice tool!

Now I have a small issue: since Marvel is generating quite a lot of data 
(for our development system), I would like to configure automatic deletion 
of old data. Is there such an option? I didn't find anything in the 
documentation. It would be great to specify a rolling window of n days of 
data to keep.



ElasticSearch Analytics Capabilities

2014-02-13 Thread Binil Thomas
ES seems to have the ability to run analytic queries. I have read about 
people using it as an OLAP solution [1], although I have not yet read 
anyone describe their experience. In that respect, how do ES's analytics 
capabilities compare against:

1) Dremel clones [2] like Impala & Presto (for near real-time, ad hoc 
analytic queries over large datasets)
2) Lambda Architecture [3] systems (where queries are known up- front, but 
need to run against a large dataset)

Does anyone here have experience running ES in such usecases, beyond the 
free text searching one ES is well-known for?

Thanks,
Binil

[1]: https://groups.google.com/forum/#!topic/elasticsearch/iTy9IYL23as
[2]: 
http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/36632.pdf
[3]: 
http://jameskinley.tumblr.com/post/37398560534/the-lambda-architecture-principles-for-architecting



Re: EC2 Discovery is not working with AutoScaling group (AWS)

2014-02-07 Thread Thomas FATTAL
Finally, I fixed my problem.
There was a mistake in the field "discovery.ec2.groups": instead of a 
string, I had to provide an array of strings.
I had also forgotten to add the tag "platform:prod" in CloudFormation when 
launching my stack.

Fixed!
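
For reference, a sketch of the corrected shape of that setting (the group
name here is a placeholder; the tag matches the one described above):

```yaml
# elasticsearch.yml -- discovery.ec2.groups must be a list of strings
discovery:
  type: ec2
  ec2:
    groups: ["my-es-security-group"]   # an array, not a bare string
    tag:
      platform: prod
```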

On Friday, 7 February 2014 14:54:05 UTC+1, Thomas FATTAL wrote:
>
> Hi,
>
> I'm trying to configure two Elasticsearch nodes in AWS in the same 
> autoscaling group (CloudFormation).
> I am having some problems with them discovering each other.
>
> The following shows the elasticsearch.log I have on the first machine with 
> the instance-id "i-2db5db03".
> The second machine has an instance-id "i-324e6612".
>
> It seems that both nodes recognize each other, thanks to 
> discovery.ec2.tag.* field I added but then there are some problems that 
> make them not to join together:
>
> [2014-02-07 13:17:08,852][INFO ][node ] 
> [ip-10-238-225-133.ec2.internal] version[1.0.0.Beta2], pid[15342], 
> build[296cfbe/2013-12-02T15:46:27Z]
> [2014-02-07 13:17:08,853][INFO ][node ] 
> [ip-10-238-225-133.ec2.internal] initializing ...
> [2014-02-07 13:17:08,917][INFO ][plugins  ] 
> [ip-10-238-225-133.ec2.internal] loaded [cloud-aws], sites [paramedic]
> [2014-02-07 13:17:15,452][DEBUG][discovery.zen.ping.unicast] 
> [ip-10-238-225-133.ec2.internal] using initial hosts [], with 
> concurrent_connects [10]
> [2014-02-07 13:17:15,455][DEBUG][discovery.ec2] 
> [ip-10-238-225-133.ec2.internal] using ping.timeout [3s], 
> master_election.filter_client [true], master_election.filter_data [false]
> [2014-02-07 13:17:15,456][DEBUG][discovery.zen.elect  ] 
> [ip-10-238-225-133.ec2.internal] using minimum_master_nodes [1]
> [2014-02-07 13:17:15,457][DEBUG][discovery.zen.fd ] 
> [ip-10-238-225-133.ec2.internal] [master] uses ping_interval [1s], 
> ping_timeout [30s], ping_retries [3]
> [2014-02-07 13:17:15,500][DEBUG][discovery.zen.fd ] 
> [ip-10-238-225-133.ec2.internal] [node  ] uses ping_interval [1s], 
> ping_timeout [30s], ping_retries [3]
> [2014-02-07 13:17:16,769][DEBUG][discovery.ec2] 
> [ip-10-238-225-133.ec2.internal] using host_type [PRIVATE_IP], tags 
> [{platform=prod}], groups [[]] with any_group [true], availability_zones 
> [[]]
> [2014-02-07 13:17:19,930][INFO ][node ] 
> [ip-10-238-225-133.ec2.internal] initialized
> [2014-02-07 13:17:19,931][INFO ][node ] 
> [ip-10-238-225-133.ec2.internal] starting ...
> [2014-02-07 13:17:20,455][INFO ][transport] 
> [ip-10-238-225-133.ec2.internal] bound_address {inet[/0.0.0.0:9300]}, 
> publish_address {inet[/10.238.225.133:9300]}
> [2014-02-07 13:17:20,527][TRACE][discovery] 
> [ip-10-238-225-133.ec2.internal] waiting for 30s for the initial state to 
> be set by the discovery
> [2014-02-07 13:17:21,981][TRACE][discovery.ec2] 
> [ip-10-238-225-133.ec2.internal] building dynamic unicast discovery nodes...
> [2014-02-07 13:17:21,982][TRACE][discovery.ec2] 
> [ip-10-238-225-133.ec2.internal] filtering out instance i-2db5db03 based 
> tags {platform=prod}, not part of [{Key: aws:cloudformation:stack-id, 
> Value: 
> arn:aws:cloudformation:us-east-1:876119091332:stack/ES-10/daf53050-8ff8-11e3-bdce-50e241629418,
>  
> }, {Key: aws:cloudformation:stack-name, Value: ES-10, }, {Key: 
> aws:cloudformation:logical-id, Value: ESASG, }, {Key: 
> aws:autoscaling:groupName, Value: ES-10-ESASG-BHGX7KKQ9QPR, }]
> [2014-02-07 13:17:21,983][TRACE][discovery.ec2] 
> [ip-10-238-225-133.ec2.internal] filtering out instance i-324e6612 based 
> tags {platform=prod}, not part of [{Key: aws:cloudformation:logical-id, 
> Value: ESASG, }, {Key: aws:cloudformation:stack-id, Value: 
> arn:aws:cloudformation:us-east-1:876119091332:stack/ES-10/daf53050-8ff8-11e3-bdce-50e241629418,
>  
> }, {Key: aws:cloudformation:stack-name, Value: ES-10, }, {Key: 
> aws:autoscaling:groupName, Value: ES-10-ESASG-BHGX7KKQ9QPR, }]
> [2014-02-07 13:17:21,983][DEBUG][discovery.ec2] 
> [ip-10-238-225-133.ec2.internal] using dynamic discovery nodes []
> [2014-02-07 13:17:23,744][TRACE][discovery.ec2] 
> [ip-10-238-225-133.ec2.internal] building dynamic unicast discovery nodes...
> [2014-02-07 13:17:23,745][TRACE][discovery.ec2] 
> [ip-10-238-225-133.ec2.internal] filtering out instance i-2db5db03 based 
> tags {platform=prod}, not part of [{Key: aws:cloudformation:stack-id, 
> Value: 
> arn:aws:cloudformation:us-east-1:876119091332:stack/ES-10/daf53050-8ff8-11e3-bdce-50e241629418,
>  
> }, {Key: aws:cloudformation:stack-name, Value: ES-10, }, {Key: 
> aws:cloudfor

Deployment of a ES cluster on AWS

2014-02-06 Thread Thomas FATTAL
Hi!

I want to deploy a cluster of Elasticsearch nodes on AWS.
All our existing infrastructure is using CloudFormation with Chef 
cookbooks. We also did setup AutoScaling Group to restart application nodes 
automatically when some are going down.

I have several questions concerning the ES cluster I try to setup:
1) I was wondering what are the best practices for managing a ES cluster on 
AWS. Is it recommended to put the EC2 ES nodes in an auto-scaling group as 
well? Or is it a problem for the EC2 discovery?

2) If the CPU goes at 100% on a machine, is it recommended to upgrade the 
type of the machine to something more powerful or to add a new node?

3) Is there a recommended configuration schema in term of number of nodes 
in the cluster ?

Thanks a lot for your answer,
Thomas (@nypias)



Re: There were no results because no indices were found that match your selected time span

2014-02-02 Thread Thomas Ardal
Okay, thanks!

On Tuesday, January 28, 2014 8:53:27 PM UTC+1, David Pilato wrote:
>
> Should work from 0.90.9. 
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet <https://twitter.com/dadoonet> | 
> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>
>
> Le 28 janvier 2014 at 20:51:14, Thomas Ardal 
> (thoma...@gmail.com) 
> a écrit:
>
> I know and that's the plan. But with 1.0.0 right around the corner and a 
> lot of data to migrate, I'll probably wait for that one. 
>
> Does Marvel only support the most recent versions of ES?
>
> On Tuesday, January 28, 2014 8:43:26 PM UTC+1, David Pilato wrote: 
>>
>>  0.90.1?
>> You should update to 0.90.10.
>>
>> --
>> David ;-) 
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>  
>> Le 28 janv. 2014 à 20:11, Thomas Ardal  a écrit :
>>
>>  As bonus info I'm running Elasticsearch 0.90.1 on windows server 2012. 
>> I'm using the Jetty plugin to force https and basic authentication, but are 
>> accessing Marvel from localhost through http. My browser asks me for 
>> credentials when opening the Marvel url, so it could be caused by the basic 
>> authentication setup. Or?
>>
>> On Tuesday, January 28, 2014 8:01:21 PM UTC+1, Thomas Ardal wrote: 
>>>
>>> When trying out Marvel on my Elasticsearch installation, I get the error 
>>> "There were no results because no indices were found that match your 
>>> selected time span" in the top of the page. 
>>>
>>> If I understand the documentation, Marvel automatically collects 
>>> statistics from all indexes on the node. What am I doing wrong?
>>>  



Re: There were no results because no indices were found that match your selected time span

2014-01-28 Thread Thomas Ardal
I know and that's the plan. But with 1.0.0 right around the corner and a 
lot of data to migrate, I'll probably wait for that one.

Does Marvel only support the most recent versions of ES?

On Tuesday, January 28, 2014 8:43:26 PM UTC+1, David Pilato wrote:
>
> 0.90.1?
> You should update to 0.90.10.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> Le 28 janv. 2014 à 20:11, Thomas Ardal > 
> a écrit :
>
> As bonus info I'm running Elasticsearch 0.90.1 on windows server 2012. I'm 
> using the Jetty plugin to force https and basic authentication, but are 
> accessing Marvel from localhost through http. My browser asks me for 
> credentials when opening the Marvel url, so it could be caused by the basic 
> authentication setup. Or?
>
> On Tuesday, January 28, 2014 8:01:21 PM UTC+1, Thomas Ardal wrote:
>>
>> When trying out Marvel on my Elasticsearch installation, I get the error 
>> "There were no results because no indices were found that match your 
>> selected time span" in the top of the page.
>>
>> If I understand the documentation, Marvel automatically collects 
>> statistics from all indexes on the node. What am I doing wrong?
>>



Re: There were no results because no indices were found that match your selected time span

2014-01-28 Thread Thomas Ardal
As bonus info, I'm running Elasticsearch 0.90.1 on Windows Server 2012. 
I'm using the Jetty plugin to force https and basic authentication, but am 
accessing Marvel from localhost through http. My browser asks me for 
credentials when opening the Marvel url, so it could be caused by the 
basic authentication setup. Could that be it?

On Tuesday, January 28, 2014 8:01:21 PM UTC+1, Thomas Ardal wrote:
>
> When trying out Marvel on my Elasticsearch installation, I get the error 
> "There were no results because no indices were found that match your 
> selected time span" in the top of the page.
>
> If I understand the documentation, Marvel automatically collects 
> statistics from all indexes on the node. What am I doing wrong?
>



There were no results because no indices were found that match your selected time span

2014-01-28 Thread Thomas Ardal
When trying out Marvel on my Elasticsearch installation, I get the error 
"There were no results because no indices were found that match your 
selected time span" at the top of the page.

If I understand the documentation, Marvel automatically collects statistics 
from all indexes on the node. What am I doing wrong?


