Re: Java bulk API slows down if client is not closed and reopened

2014-03-10 Thread Ondřej Spilka
Thanks Jörg, I completely forgot about the way of indexing via JSON documents
that I already did for ES from PowerShell months ago...

I understand that the ES JSON format is very versatile. On the other hand, a
Solr-compatible option to index a plain POCO JSON file that consists only of
an array of objects would be convenient when migrating from Solr to ES.
There is no problem as long as the ID property can be specified in the schema,
just as in Solr.
So when you have a schema with an ID property, you are on the right track, and
ES should be able to perform an index/update on a POCO JSON array.

Then I can imagine having a preprocessor that converts XML/CSV or whatever to
collection-schema-compatible JSON, so the search engine could easily be chosen
between ES and Solr.
It seems like a nice idea to me, as I'm just a user of the search engine...
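For illustration only, a minimal Java sketch of such a preprocessing step,
assuming the array has already been parsed into a list of maps and that the
schema's ID property is called "id" (the index and type names are also
assumptions):

import java.util.List;
import java.util.Map;

import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.client.Client;

public class PocoArrayIndexer {
    // Bulk-indexes a plain array of objects, Solr-style, using the "id"
    // property declared in the schema as the document _id.
    public static void indexAll(Client client, List<Map<String, Object>> docs) {
        BulkRequestBuilder bulk = client.prepareBulk();
        for (Map<String, Object> doc : docs) {
            String id = String.valueOf(doc.get("id"));
            bulk.add(client.prepareIndex("myindex", "mytype", id).setSource(doc));
        }
        BulkResponse response = bulk.execute().actionGet();
        if (response.hasFailures()) {
            System.err.println(response.buildFailureMessage());
        }
    }
}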

Ondra


On Monday, March 10, 2014 8:19:20 PM UTC+1, Jörg Prante wrote:
>
> There is a special ES indexing data model, as you surely already have 
> noted. You can only index a subset of valid JSON into ES. For example, each 
> ES JSON doc must be an object. Arrays must be single-valued, unnested. So, 
> arbitrary source JSON must be transformed, and due to the field/value 
> indexing, there is more than one possible model, which depends on your data 
> domain. 
>
> XML is also not straightforward to translate. Attributes and values have 
> to be mapped to JSON fields and there is more than one possibility to do so.
>
> Another question is how to build identifiers from documents for ES doc _id.
>
> In my domain, I transform all my input data (K/V, ISO 2709, JSON, XML) to 
> RDF, create an IRI, and this RDF can be serialized to JSON-LD which fits 
> well into the ES JSON indexing model. YMMV.
>
> Jörg
>
>



Re: DELETE snapshot request (which was long-running and had not yet completed) is hung

2014-03-10 Thread Swaroop CH
Hi Igor,

It seems that the S3 bucket had "PUT only" permissions.

Regards,
Swaroop

10.03.2014, 17:40, "Igor Motov":
That's strange. Wrong S3 permissions should have caused it to fail
immediately. Could you provide any more details about the permissions, so I
can reproduce it? Meanwhile, restarting the nodes where the primary shards of
the stuck index are located is the only option that I can think of. We are
working on improving the performance of snapshot cancellation
(https://github.com/elasticsearch/elasticsearch/pull/5244), but it hasn't
made it into a release yet.

On Monday, March 10, 2014 4:39:24 AM UTC-4, Swaroop wrote:
Hi,

I had started a snapshot request on a freshly indexed ES 1.0.1 cluster with
the cloud plugin installed, but unfortunately the configured EC2 access keys
did not have S3 permissions, so ES was in a weird state. I sent a DELETE
snapshot request and it has been stuck for more than a couple of hours. Any
advice on how to clean up the snapshot request? The logs don't reveal
anything relevant.

Regards,
Swaroop





Re: How Does Elasticsearch Store Data?

2014-03-10 Thread Nii Hung
I'm new to Elasticsearch and recently started playing around with version
1.0.1. I just saw your question and noticed that the data files are at

/data/<cluster_name>/nodes/0/indices/<index_name>/0/index/segments_1

Note that Lucene data is stored in segments.


On Mon, Mar 10, 2014 at 4:47 PM, Marcus Young wrote:

> I'm running Logstash with Redis, Kibana, and Elasticsearch.
>
> All is working well.  But when I look in the configured location for data,
> as set in the elasticsearch.yml file I see nothing.  The directory is empty.
>
> But I can see my data in Kibana and view it in Redis.
>
> So where is Elasticsearch hiding it?
>
> My config setting is: path.data: /var/lib/elasticsearch
>
> ...which is empty.
>
> As I understand it Elasticsearch relies on Lucene which I thought stored
> data in flat files.  If this is the case where are these files?
>
> I need to know this so I can adequately plan storage.
>
> Thanks.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/7fe3bfb6-869f-42fa-9d84-de8238c78019%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>



Re: Elasticsearch User Group - 2nd Sydney Meeting 17th March

2014-03-10 Thread Ivan Brusic
Jealous! Two great talks in a great city.

-- 
Ivan


On Mon, Mar 10, 2014 at 8:00 PM, Mark Walkom wrote:

> Hi Everyone,
> We're having another user group meeting next week on Monday in Sydney,
> Australia. If you're interested in coming along then head to our Meetup
> page and RSVP -
> http://www.meetup.com/Elasticsearch-Sydney-Meetup/events/165807792/
>
> This time around we have two talks from Elasticsearch employees who are
> visiting for training; "*Custom Scoring Functions in Elasticsearch"* by
> Britta Weber and "*Introduction to Logstash"* by Lee Hinman. Drink and
> food are catered by our hosts Atlassian.
>
> It'd be great to meet some local ES users, so please come along!
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAEM624bNQM9D9hnWvzaaTwJTd47b6iC%3DHFvv969-f6c8mZTEhA%40mail.gmail.com
> .
> For more options, visit https://groups.google.com/d/optout.
>



Elasticsearch User Group - 2nd Sydney Meeting 17th March

2014-03-10 Thread Mark Walkom
Hi Everyone,
We're having another user group meeting next week on Monday in Sydney,
Australia. If you're interested in coming along then head to our Meetup
page and RSVP -
http://www.meetup.com/Elasticsearch-Sydney-Meetup/events/165807792/

This time around we have two talks from Elasticsearch employees who are
visiting for training; "*Custom Scoring Functions in Elasticsearch"* by
Britta Weber and "*Introduction to Logstash"* by Lee Hinman. Drink and food
are catered by our hosts Atlassian.

It'd be great to meet some local ES users, so please come along!

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com



elasticsearch for Hadoop library can't read data for array of primitive type in PIG

2014-03-10 Thread Juan Huang
I am using the Elasticsearch for Hadoop library described at
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/pig.html
to read data in Pig.

There is a field "household_ages" defined as "short" in my index type. But 
it is actually an array since this field has multiple values.
PUT /myindex/mytype/_mapping
{
   "mytype": {
  "properties": { 
...
"household_ages": {"type": "short"}  
}
   }
}

An example value of the field "household_ages" is like: [45,40,17,13]

When I run the Pig script, it can't read the value of the field
"household_ages"; other fields with primitive or map types are fine. Only
the field with an array of a primitive type can't be read correctly.

Below is the PIG script: 
A = LOAD 'myindex/mytype' USING 
org.elasticsearch.hadoop.pig.EsStorage('es.nodes=localhost:9200');
DUMP A;

The value I got for this column is "(,,,)".

Below is a complete record dumped by the Pig script. I can see that values
are read properly for other fields; only the "household_ages" field can't
get the data:
(12345,[lname#Smith,fname#John],[state# NY,zip# 0,city#New York,street#123 Main St],(,,,),[special_clearance#true,weekly_update#true,birthday_greeting#false])

Does anyone have any idea how to fix the issue?

Thanks,
Juan




How Does Elasticsearch Store Data?

2014-03-10 Thread Marcus Young
I'm running Logstash with Redis, Kibana, and Elasticsearch.

All is working well.  But when I look in the configured location for data, 
as set in the elasticsearch.yml file I see nothing.  The directory is empty.

But I can see my data in Kibana and view it in Redis.

So where is Elasticsearch hiding it?

My config setting is: path.data: /var/lib/elasticsearch

...which is empty.

As I understand it Elasticsearch relies on Lucene which I thought stored 
data in flat files.  If this is the case where are these files?

I need to know this so I can adequately plan storage.

Thanks.



memcached

2014-03-10 Thread Mohit Anchlia
Is anyone using memcached with ES as a plugin? I am wondering if it
provides any benefits over the on-heap caching.



Plugin for Multilingual Text Analytics

2014-03-10 Thread jandrews
Check out the new multilingual search plugin for Elasticsearch:
http://www.basistech.com/elasticsearch/
(tokenization, lemmatization, decompounding, Noun Phrase Extraction,  POS 
tagging, along with entity extraction and entity resolution in Asian, 
European and Middle Eastern languages.)

Installs in 6 simple commands:
Download and Install the RSE-ES plugin.
Navigate to the elasticsearch-0.90.10 directory and run

bin/plugin --install analysis-rse --url 
http://download.basistech.com/httpFDL/elasticsearch-analysis-rse/latest


The RSE-ES plugin is now in plugins/analysis-rse.

To start the Elasticsearch search server, run

bin/elasticsearch -f



Re: Marvel strategy to data store

2014-03-10 Thread Mark Walkom
You can use the Sense dashboard in Marvel to run the curl.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 10 March 2014 19:29, David Pilato  wrote:

> You can not delete from Marvel yet.
>
> But you can run
>
> curl -XDELETE http://localhost:9200/.marvel-
>
> or use curator:
>
> https://github.com/elasticsearch/curator
>
>  --
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet  | 
> @elasticsearchfr
>
>
> Le 10 mars 2014 à 09:27:25, Shilpi Agrawal (shil...@strumsoft.com) a
> écrit:
>
> How can we delete the data from Marvel?
> Some instructions or commands would be helpful.
>
> Thanks
>
> On Monday, March 10, 2014 12:13:22 PM UTC+5:30, Shilpi Agrawal wrote:
>>
>> Hi,
>>
>> I want to know the strategy that Marvel follows to store its data:
>> for how long it stores the data, how it flushes the data, how much data
>> can be stored, and any limitations.
>>
>> Also, how does it manage the data stored for each cluster?
>>
>> Please help me understand this.
>>
>> Thanks
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/b2d3c23d-21aa-4ffc-a875-befdb89b2f14%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/etPan.531d77e0.238e1f29.b095%40MacBook-Air-de-David.local
> .
>
> For more options, visit https://groups.google.com/d/optout.
>



Re: multi_field sort and search issue

2014-03-10 Thread James Reynolds
Accidentally hit submit early.

I have a very similar mapping where multi_field is working fine. All the
mapping information is on the SO thread, as well as explain output and other
results.

Any help that any of you can provide, I would greatly appreciate.



On Monday, March 10, 2014 5:07:03 PM UTC-4, James Reynolds wrote:
>
> I have a question up on SO here:
>
> For some reason, I can't sort or search in a multi_field in one particular 
> index:
>
>
> http://stackoverflow.com/questions/22305891/elasticsearch-multi-field-type-search-and-sort-issue
>



multi_field sort and search issue

2014-03-10 Thread James Reynolds
I have a question up on SO here:

For some reason, I can't sort or search in a multi_field in one particular 
index:

http://stackoverflow.com/questions/22305891/elasticsearch-multi-field-type-search-and-sort-issue



Re: Authentication again

2014-03-10 Thread Martin Hátaš
Thank you very much for the answer.
I think the elasticsearch-jetty plugin handles only HTTP requests (REST API
requests), am I right?
I have already disabled HTTP with the *http.enabled: false* option, and I
want to secure communication among the nodes.
For access to the cluster I use the Java API. It's not clear to me whether
elasticsearch-jetty can encrypt this communication.

On Monday, March 10, 2014 6:05:35 PM UTC+1, Binh Ly wrote:
>
> Perhaps you can give this a try:
>
> https://github.com/sonian/elasticsearch-jetty
>



Re: TransportClient timeout / webserver configuration - JAVA Api

2014-03-10 Thread joergpra...@gmail.com
Check the DNS of the ES servers. Netty initializes DNS services to get the
publish address, which is used for setting the bound address, and this
network call seems to take a long time (>30 secs), so the bound address is
null for a while.

If your DNS or network is misconfigured, a DNS call will stall for a long
time. During this time, no node ping request/response should occur, since
this would trigger an NPE in NettyHttpServerTransport.

You could also increase the default node ping timeout,
discovery.zen.ping.timeout, which is 30 secs by default (already very high).

Another fix: I have all my ES server names/IPs also registered in
/etc/hosts, so OS-based server name resolution cannot play tricks on Java.
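
As an illustrative sketch (cluster name, host, and timeout values are
assumptions, not taken from this thread), the client-side transport ping
timeout can also be raised when the TransportClient is built:

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class ClientFactory {
    // Builds a long-lived TransportClient with a more generous ping timeout,
    // so that a slow DNS lookup does not immediately drop the node.
    public static TransportClient newClient() {
        Settings settings = ImmutableSettings.settingsBuilder()
                .put("cluster.name", "my-cluster")               // assumed cluster name
                .put("client.transport.ping_timeout", "30s")
                .put("client.transport.nodes_sampler_interval", "30s")
                .build();
        return new TransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress("es-host", 9300));
    }
}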

Jörg


On Mon, Mar 10, 2014 at 6:18 PM, Robert Langenfeld
wrote:

> Thomas,
>
> We can connect and do queries and index operations just fine. The problem
> we are experiencing is that after a long idle period (15 minutes or so) of
> doing nothing, this occurs when we go to perform another operation.
>
>
> On Monday, March 10, 2014 8:46:53 AM UTC-5, Thomas wrote:
>>
>> As per the documentation Client is threadsafe and is suggested by the
>> elasticsearch team to use the same client instance across your app.
>> Considering your exception above you might need to look your configuration
>> first (like cluster name and host/port) and you should use port 9300 for
>> the Java API. Finally, check whether you are under a firewall or something
>> and you block 9300 port.
>>
>> Hope it helps
>> Thomas
>>
>> On Monday, 10 March 2014 15:42:25 UTC+2, Robert Langenfeld wrote:
>>>
>>> Hello,
>>>
>>> I'm developing a tomcat webserver application that uses ElasticSearch
>>> 1.0 (Java API). There is a client facing desktop application that
>>> communicates with the server so all the code for ElasticSearch is on that
>>> one instance and it is used by all our clients. With that being said I am
>>> running into this issue: After initializing a new TransportClient object
>>> and performing some operation on it, there is a chance that i could sit
>>> idle for a very long time. When does sit idle for a long time it gets this
>>> error:
>>>
>>>
>>> Mar 08, 2014 1:15:37 AM org.elasticsearch.client.transport
>>>
>>> INFO: [Elven] failed to get node info for [#transport#-1][WIN7-113-
>>> 00726][inet[/159.140.213.87:9300]], disconnecting...
>>>
>>> org.elasticsearch.transport.RemoteTransportException:
>>> [Server_Dev1][inet[/159.140.213.87:9300]][cluster/nodes/info]
>>>
>>> Caused by: java.lang.NullPointerException
>>>
>>> at org.elasticsearch.http.HttpInfo.writeTo(HttpInfo.java:82)
>>>
>>> at org.elasticsearch.action.admin.cluster.node.info.
>>> NodeInfo.writeTo(NodeInfo.java:301)
>>>
>>> at org.elasticsearch.action.admin.cluster.node.info.
>>> NodesInfoResponse.writeTo(NodesInfoResponse.java:63)
>>>
>>> at org.elasticsearch.transport.netty.NettyTransportChannel.sendResponse(
>>> NettyTransportChannel.java:83)
>>>
>>> at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$
>>> TransportHandler$1.onResponse(TransportNodesOperationAction.java:244)
>>>
>>> at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$
>>> TransportHandler$1.onResponse(TransportNodesOperationAction.java:239)
>>>
>>> at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$
>>> AsyncAction.finishHim(TransportNodesOperationAction.java:225)
>>>
>>> at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$
>>> AsyncAction.onOperation(TransportNodesOperationAction.java:200)
>>>
>>> at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$
>>> AsyncAction.access$900(TransportNodesOperationAction.java:102)
>>>
>>> at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$
>>> AsyncAction$2.run(TransportNodesOperationAction.java:146)
>>>
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(
>>> ThreadPoolExecutor.java:1145)
>>>
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>>> ThreadPoolExecutor.java:615)
>>>
>>> at java.lang.Thread.run(Thread.java:744)
>>>
>>> Is there any way to prevent this from happening? I know the ideal
>>> situation would be that after every request the transport client is closed.
>>> But since it lives on a webserver with lots of search requests coming in,
>>> we would ideally like it to stay open because it takes 3-4 seconds for a
>>> transport client to initialize and we are going for speed here.
>>>
>>> Also since we are having one central server to handle all search and
>>> index requests, can the TransportClient handle multiple simultaneous
>>> requests from different users at the same time? We just want to make sure
>>> that we are doing this correctly.
>>>

Re: help with setting up mappings (Perl)

2014-03-10 Thread Clinton Gormley
Hi Dom

You need to consult the Elasticsearch documentation for mapping. See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string

What you're looking for is:


properties => {
somefield => {
type => 'string',
index => 'not_analyzed'
}
}
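
If it helps to see the complete request, here is the same mapping sketched
via the Java admin API (the index and type names are taken from this thread;
everything else is illustrative):

import org.elasticsearch.client.Client;

public class MappingHelper {
    // Puts a mapping that disables analysis on "somefield" for type "my_type".
    public static void putMapping(Client client) {
        client.admin().indices().preparePutMapping("idx-2014.03.10")
                .setType("my_type")
                .setSource("{\"my_type\":{\"properties\":{"
                        + "\"somefield\":{\"type\":\"string\",\"index\":\"not_analyzed\"}}}}")
                .execute().actionGet();
    }
}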


Clint


On 10 March 2014 18:45, Dominic Nicholas wrote:

> Hi Clinton,
>
> I really appreciate the fast reply. We're now using Search::Elasticsearch.
> I'm still having a problem getting mappings set and hope you can help.
> What I'm trying to do is turn off the analyzer for certain fields. Here is
> what I have:
>
> In Perl :
>
> %mappings = (
> 'index' => 'idx-2014.03.10',
> 'type'  => 'my_type',
> 'body' => {
> 'my_type' => {
> "properties" => {
> 'somefield' =>
> 'not_analyzed'
>  }
> }
>   }
> );
>
> eval { $es_result = $elastic_search_object->indices->put_mapping(
> \%mappings ) ; };
> etc etc
>
> The put_mapping call does not return an error.
>
> In JSON format, the mappings hash is :
>
> {
>"index" : "idx-2014.03.10",
>"type" : "my_type"
>"body" : {
>   "my_type" : {
>  "properties" : {
> "somefield" : "not_analyzed"
>  }
>   }
>}
> }
>
> When I try to get the mappings for this type/index with:
>
> curl -XGET 'localhost:9200/idx-2014.03.10/my_type/_mapping'
>
> I get {} which I think means there are no mappings yet set for this index
> and type.
>
> Is the body section above the correct structure? Is 'properties'
> always required? Any guidance would be much appreciated again.
>
> I've been using these as guides so far :
> -
> https://metacpan.org/pod/Search::Elasticsearch::Client::Direct::Indices#put_mapping
> -
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-put-mapping.html
> - some training docs
>
> Thanks again
> Dom
>
>
> On Monday, March 10, 2014 10:35:21 AM UTC-4, Dominic Nicholas wrote:
>>
>> Hi,
>>
>> I'm using the Elasticsearch Perl module and need guidance on setting up
>> mappings.
>> I'm using the bulk() method to index data. Here is an example of the
>> structure of the data :
>>
>>   $response = $e->bulk(
>>  "index" : "idx-2014.03.10",
>>  "type" : "my_type",
>>  "body" : [
>> {
>> "index" : {
>> "_index" : "idx-2014.03.10",
>> "_id" : "4410",
>> "_type" : "my_type"
>> }
>> },
>> {
>> "something" : "interesting",
>> "somethingelse" : "also interesting"
>> },
>> {
>> "index" : {
>> "_index" : "idx-2014.03.10",
>> "_id" : "4411",
>> "_type" : "my_type"
>> }
>> },
>> {
>> "something" : "very interesting",
>> "somethingelse" : "not interesting"
>> }
>>  ]
>>   );
>>
>> How do I set up mappings on various fields in the above example for
>> 'something' and 'somethingelse' fields ?
>> Also, how do I turn off the analyzer for an index (index: not_analyzed)
>>  too ?
>>
>> I know there are several ways of setting up mappings such as :
>>
>> - when creating an index
>> - by using the dedicated update mapping api
>> - using index templates
>>
>> Ideally I'd like to use the dedicated update mapping api but am unclear
>> how to use that through the Perl library interface (eg use
>> transport->perform_request() ?).
>>
>> Thanks for any guidance and help.
>>
>> Dom
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/f23f9a91-3176-4bea-8258-c9d38d1dd34f%40googlegroups.com
> .
>
> For more options, visit https://groups.google.com/d/optout.
>


Re: Java bulk API slows down if client is not closed and reopened

2014-03-10 Thread joergpra...@gmail.com
There is a special ES indexing data model, as you surely already have
noted. You can only index a subset of valid JSON into ES. For example, each
ES JSON doc must be an object. Arrays must be single-valued, unnested. So,
arbitrary source JSON must be transformed, and due to the field/value
indexing, there is more than one possible model, which depends on your data
domain.

XML is also not straightforward to translate. Attributes and values have to
be mapped to JSON fields and there is more than one possibility to do so.

Another question is how to build identifiers from documents for ES doc _id.

In my domain, I transform all my input data (K/V, ISO 2709, JSON, XML) to
RDF, create an IRI, and this RDF can be serialized to JSON-LD which fits
well into the ES JSON indexing model. YMMV.

Jörg



Insertion Spikes

2014-03-10 Thread Parag Shah
Hi all,

   As part of our performance exercise, we have been trying to characterize
the insertion performance of Elasticsearch (0.90.7). Here is our setup:

*Nodes:* 3 AWS m1.xlarge (16G)
*Memory:* 8G Heap on each node.

*Indices:* 5 aliases, 3 indexes per alias, 2 shards per index. (30 shards), 
1 replica.
*Total: *60 shards across 3 machines in the cluster.

*Client Nodes:* 3 AWS m1.large (8G)
*No. of threads per client:* 50
*Memory:* 1 G (heap)
*Overall:* 1.4 M documents
*Avg Doc size:* 4K (most messages were 500-1500 bytes; there were a
bunch of messages >8K)

The graph below shows insertion spikes at various points. We tried to 
correlate them with the merge times, but that did not seem to hold true. 
There also seems to be no direct correlation between message size and 
insertion times.



Insertion times have been plotted at 95th percentile. 
Below is a plot for the size of messages:


My question is: is there something we are doing wrong here, or is there a
way to explain why the spikes occur for insertion times?

Any help will be appreciated.

Regards
Parag



Re: Accessing date field in a native (Java) script

2014-03-10 Thread Binh Ly
If you do something like this, you should get the epoch value in 
milliseconds. Then you can use that value to initialize whatever object you 
want:

  // assumes org.elasticsearch.index.fielddata.ScriptDocValues is imported
  ScriptDocValues v = (ScriptDocValues) doc().get(dateField);
  if (v != null && !v.isEmpty()) {
    long epoch_ms = ((ScriptDocValues.Longs) v).getValue(); // date as epoch milliseconds
  }



Re: ElasticSearch - how to SUM function_score by field

2014-03-10 Thread Jay Hilden
Thanks for the response Binh.  Is there a way to pipe the response from my
first query directly into a new index without round tripping the data back
to the server that requested the query?


On Mon, Mar 10, 2014 at 1:09 PM, Binh Ly  wrote:

> Unfortunately, subqueries are not supported. What you can do is dump the
> results of your first query into an index. And then you can do a
> terms_stats facet on that dump to get your final results:
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-stats-facet.html
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/hiTA1hsdLr8/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/04770395-4c81-4d42-9566-df763a39fb0c%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>



Re: No results from suggest query

2014-03-10 Thread David Register
I updated the gist to reflect the answer.

https://gist.github.com/dmregister/9467385

It turns out, the index_analyzer and search_analyzer "simple" do not work 
well with numbers. Changing them to "keyword" did the trick.

On Friday, March 7, 2014 7:09:39 PM UTC-5, David Register wrote:
>
> The query stopped working after I upgraded to 1.0.1 from 0.90.5. I am 
> using PHP and also updated my library from 0.4 to 1.0. 
>
> I am trying to bulk index my data, I am wondering if that could be the 
> problem.
>
> I am providing a gist with all the current info about this situation.
>
> https://gist.github.com/dmregister/efc07f4f4ffc2a96b4d2
>
> Any help would be greatly appreciated.
>
> Thanks.
> David
>



Re: ElasticSearch - how to SUM function_score by field

2014-03-10 Thread Binh Ly
Unfortunately, subqueries are not supported. What you can do is dump the 
results of your first query into an index. And then you can do a 
terms_stats facet on that dump to get your final results:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-stats-facet.html
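
As a sketch with the Java API (the index and field names here are
placeholders, not from the original question), the terms_stats facet on the
dumped results could look like this:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.facet.FacetBuilders;

public class TermsStatsExample {
    // Sums "score" per "user_id" over the index holding the dumped results.
    public static SearchResponse sumByField(Client client) {
        return client.prepareSearch("dump_index")
                .setQuery(QueryBuilders.matchAllQuery())
                .addFacet(FacetBuilders.termsStatsFacet("sum_by_user")
                        .keyField("user_id")
                        .valueField("score"))
                .setSize(0)
                .execute().actionGet();
    }
}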



Re: Can a replica be updated with the deltas only?

2014-03-10 Thread Binh Ly
I stand corrected. Clint is right. ES will try to apply only diffs as much 
as possible at the segment level. But if your underlying segments have 
diverged significantly since the replica node went down, it is likely that
you'll end up copying a lot more than the diffs (document-wise).
Otherwise, it'll just copy segments that have changed. :)



Re: help with setting up mappings (Perl)

2014-03-10 Thread Dominic Nicholas
Hi Clinton,

I really appreciate the fast reply. We're now using Search::Elasticsearch. 
I'm still having a problem getting mappings set and hope you can help.
What I'm trying to do is turn off the analyzer for certain fields. Here is
what I have:

In Perl :

%mappings = (  
'index' => 'idx-2014.03.10',
'type'  => 'my_type',
'body' => {
'my_type' => {
"properties" => {
'somefield' => 
'not_analyzed'
 }
}
  }
);

eval { $es_result = $elastic_search_object->indices->put_mapping( 
\%mappings ) ; };
etc etc

The put_mapping call does not return an error. 

In JSON format, the mappings hash is :

{
   "index" : "idx-2014.03.10",
   "type" : "my_type"
   "body" : {
  "my_type" : {
 "properties" : {
"somefield" : "not_analyzed"
 }
  }
   }
}

When I try to get the mappings for this type/index with:

curl -XGET 'localhost:9200/idx-2014.03.10/my_type/_mapping'

I get {} which I think means there are no mappings yet set for this index 
and type.

Is the body section above the correct structure? Is 'properties' always
required? Any guidance would be much appreciated again.

I've been using these as guides so far :
- 
https://metacpan.org/pod/Search::Elasticsearch::Client::Direct::Indices#put_mapping
- 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-put-mapping.html
- some training docs

Thanks again
Dom


On Monday, March 10, 2014 10:35:21 AM UTC-4, Dominic Nicholas wrote:
>
> Hi,
>
> I'm using the Elasticsearch Perl module and need guidance on setting up 
> mappings.
> I'm using the bulk() method to index data. Here is an example of the 
> structure of the data :
>
>   $response = $e->bulk(
>  "index" : "idx-2014.03.10",
>  "type" : "my_type",
>  "body" : [
> {
> "index" : {
> "_index" : "idx-2014.03.10",
> "_id" : "4410",
> "_type" : "my_type"
> }
> },
> {
> "something" : "interesting",
> "somethingelse" : "also interesting"
> },
> {
> "index" : {
> "_index" : "idx-2014.03.10",
> "_id" : "4411",
> "_type" : "my_type"
> }
> },
> {
> "something" : "very interesting",
> "somethingelse" : "not interesting"
> }
>  ]
>   );
>
> How do I set up mappings on various fields in the above example for 
> 'something' and 'somethingelse' fields ?
> Also, how do I turn off the analyzer for an index (index: not_analyzed) 
>  too ?
>
> I know there are several ways of setting up mappings such as :
>
> - when creating an index 
> - by using the dedicated update mapping api
> - using index templates
>
> Ideally I'd like to use the dedicated update mapping api but am unclear 
> how to use that through the Perl library interface (eg use 
> transport->perform_request() 
> ?).
>
> Thanks for any guidance and help.
>
> Dom
>
>



Re: BigDecimal support

2014-03-10 Thread mooky
Righto - I will try add some.
-Nick

On Wednesday, 5 March 2014 13:48:58 UTC, Jörg Prante wrote:
>
> Yes, there are no tests yet.
>
> Jörg
>
>
> On Wed, Mar 5, 2014 at 2:24 PM, mooky wrote:
>
>> I am ready to create a pull request - it's actually quite a simple change.
>> However, I can't find *any* tests for the existing BigDecimal support ...
>> does that sound right?
>>
>> -Nick
>>
>>
>>
>> On Friday, 28 February 2014 12:09:00 UTC, mooky wrote:
>>>
>>> XContentBuilder has support for BigDecimal, but:
>>>
>>>1. If you pass the source as a Map when indexing, the BigDecimal
>>>handling doesn't get invoked
>>>(https://github.com/elasticsearch/elasticsearch/issues/5260).
>>>2. The existing handling should delegate through to Jackson's handling
>>>of BigDecimal (which can be configured to serialise BigDecimal in a
>>>lossless fashion - I don't think that feature existed when I had to
>>>worry about it last)
>>>
>>> Looking at the code now, I think its actually an easy change - I will 
>>> see if I can create a pull request.
>>>
>>> -Nick
>>>
>>>
>>> On Wednesday, 26 February 2014 17:28:29 UTC, Jörg Prante wrote:

 ES accepts BigDecimal input. You can specify scale and rounding mode to 
 format the BigDecimal. 

 https://github.com/jprante/elasticsearch/commit/8ef8cd149b867e3e45bc3055dfd6da80e4e9c7ec

 Internally, BigDecimal is automatically converted to a JSON string if 
 the number does not fit into double format. Because numbers are useful in 
 Lucene for range searches, they have an advantage.

 But I agree, another option could be to enforce string conversion in 
 any case, for example storing currency values as strings for financial 
 services, without arithmetic operations in the index.

 Maybe the toEngineeringString() was not a smart decision and 
 toPlainString() works better.
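
For instance, with plain JDK BigDecimal (just to illustrate the difference
between the two string forms; the values are examples only):

import java.math.BigDecimal;

public class BigDecimalStrings {
    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("42.5400");
        System.out.println(d.toPlainString());          // 42.5400
        System.out.println(d.toEngineeringString());    // 42.5400

        BigDecimal tiny = new BigDecimal("0.000000012000");
        System.out.println(tiny.toPlainString());       // 0.000000012000
        System.out.println(tiny.toEngineeringString()); // 12.000E-9
    }
}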

 So I would welcome improvements, or should I suggest one in a pull 
 request?

 Jörg



 On Wed, Feb 26, 2014 at 6:05 PM, mooky  wrote:

> In financial services space, we almost never use float/double in our 
> domain - we always use BigDecimal.
>
> In elastic, I would like to be able to index/store BigDecimal in a 
> lossless manner (ie what I get back from _source has the same precision, 
> etc as what I put in).
>
> When I have had to preserve the json serialisation of BigDecimal, I 
> have usually had custom serialiser/deserialisers that printed it out as a 
> json number - but whose textual value was toPlainString(). When 
> deserialising, creating the BigDecimal with the string value (e.g. 
> '42.5400') maintained the precision that was originally serialised
> e.g.
>
> {
>   verySmallNumber : 0.012000,
>   otherNumber : 42.5400
> }
>
> Perhaps elastic could index bigdecimal as a double - but store it in 
> the source in a lossless fashion.
> It would require a user setting, I guess, to treat all floating point 
> numbers as BigDecimal.
>
> Thoughts?
>
> -- 
> You received this message because you are subscribed to the Google 
> Groups "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send 
> an email to elasticsearc...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/elasticsearch/b54dfd5a-3a0e-4946-aa5f-28b3794a92ac%
> 40googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>

  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/b8463a21-c997-4269-ae52-992caae88ced%40googlegroups.com
>> .
>>
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>



Re: TransportClient timeout / webserver configuration - JAVA Api

2014-03-10 Thread Robert Langenfeld
Thomas,

We can connect and do queries and index operations just fine. The problem we
are experiencing is that after a long idle period (15 minutes or so) of doing
nothing, this occurs when we go to perform another operation.

On Monday, March 10, 2014 8:46:53 AM UTC-5, Thomas wrote:
>
> As per the documentation Client is threadsafe and is suggested by the 
> elasticsearch team to use the same client instance across your app. 
> Considering your exception above you might need to look your configuration 
> first (like cluster name and host/port) and you should use port 9300 for 
> the Java API. Finally, check whether you are under a firewall or something 
> and you block 9300 port.
>
> Hope it helps
> Thomas
>
> On Monday, 10 March 2014 15:42:25 UTC+2, Robert Langenfeld wrote:
>>
>> Hello,
>>
>> I'm developing a tomcat webserver application that uses ElasticSearch 1.0 
>> (Java API). There is a client facing desktop application that communicates 
>> with the server so all the code for ElasticSearch is on that one instance 
>> and it is used by all our clients. With that being said I am running into 
>> this issue: After initializing a new TransportClient object and performing 
>> some operation on it, there is a chance that it could sit idle for a very 
>> long time. When it does sit idle for a long time, it gets this error:
>>
>>
>> Mar 08, 2014 1:15:37 AM org.elasticsearch.client.transport
>>
>> INFO: [Elven] failed to get node info for 
>> [#transport#-1][WIN7-113-00726][inet[/159.140.213.87:9300]], 
>> disconnecting...
>>
>> org.elasticsearch.transport.RemoteTransportException: 
>> [Server_Dev1][inet[/159.140.213.87:9300]][cluster/nodes/info]
>>
>> Caused by: java.lang.NullPointerException
>>
>> at org.elasticsearch.http.HttpInfo.writeTo(HttpInfo.java:82)
>>
>> at 
>> org.elasticsearch.action.admin.cluster.node.info.NodeInfo.writeTo(NodeInfo.java:301)
>>
>> at 
>> org.elasticsearch.action.admin.cluster.node.info.NodesInfoResponse.writeTo(NodesInfoResponse.java:63)
>>
>> at 
>> org.elasticsearch.transport.netty.NettyTransportChannel.sendResponse(NettyTransportChannel.java:83)
>>
>> at 
>> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$TransportHandler$1.onResponse(TransportNodesOperationAction.java:244)
>>
>> at 
>> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$TransportHandler$1.onResponse(TransportNodesOperationAction.java:239)
>>
>> at 
>> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.finishHim(TransportNodesOperationAction.java:225)
>>
>> at 
>> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.onOperation(TransportNodesOperationAction.java:200)
>>
>> at 
>> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.access$900(TransportNodesOperationAction.java:102)
>>
>> at 
>> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction$2.run(TransportNodesOperationAction.java:146)
>>
>> at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>
>> at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>
>> at java.lang.Thread.run(Thread.java:744)
>>
>> Is there any way to prevent this from happening? I know the ideal 
>> situation would be that after every request the transport client is closed. 
>> But since it lives on a webserver with lots of search requests coming in, 
>> we would ideally like it to stay open because it takes 3-4 seconds for a 
>> transport client to initialize and we are going for speed here.
>>
>> Also since we are having one central server to handle all search and 
>> index requests, can the TransportClient handle multiple simultaneous 
>> requests from different users at the same time? We just want to make sure 
>> that we are doing this correctly.
>>
>>



Re: Fields with same name mapped differently in different types

2014-03-10 Thread morninlark
Thank you for your answer.
Renaming would create other problems for me, so I went with putting the
types in different indexes.



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Fields-with-same-name-mapped-differently-in-different-types-tp4051423p4051447.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.



Re: Authentication again

2014-03-10 Thread Binh Ly
Perhaps you can give this a try:

https://github.com/sonian/elasticsearch-jetty



Re: Unable to load script under config/scripts

2014-03-10 Thread Binh Ly
Here's a basic example you can try:

https://gist.github.com/bly2k/9468905



Re: Java bulk API slows down if client is not closed and reopened

2014-03-10 Thread Ondřej Spilka
Thanks for the tips. Yes, I am reusing the request builder as shown in the
example in the docs, so this could be the case.
I will try to re-instantiate the request builder and will let you know.

Btw, is there a way to simply bulk index a JSON/XML file, as in Solr?
This is an extremely useful feature, as it isolates large-document
preprocessing from indexing...

Thanks for support.



Re: Can a replica be updated with the deltas only?

2014-03-10 Thread Clinton Gormley
If a replica recovers from the primary, then the node hosting it is shut
down shortly thereafter, when it comes back up it will only copy the
segments that have changed in the interim period.

However, merges happen independently on the primary and the replica. When a
replica has been running for a long time, its segments will have diverged
from those of the primary, and so more segments need to be copied across.

In the future we hope to improve this process.


On 10 March 2014 17:11, Binh Ly  wrote:

> I don't believe this is possible. ES does sync replication by default and
> then when a replica is down while updates/inserts are coming in, that
> replica is simply invalidated and then fully recovered later once it comes
> back up.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/a9734310-58d9-4f0f-bc98-f7ef9bf7bd29%40googlegroups.com
> .
>
> For more options, visit https://groups.google.com/d/optout.
>



Re: Fields with same name mapped differently in different types

2014-03-10 Thread Binh Ly
Unfortunately, no, you can't do this at the moment. You'd have to make all
your field types the same, or use different field names. If they are all in
the same index in different types with different data types and the same
field names, you'll encounter this limitation.



Re: Is it possible to isolate search queries to a single node

2014-03-10 Thread Binh Ly
You can dynamically add (or reduce) a replica on an index, like:

PUT localhost:9200/index/_settings
{ "index.number_of_replicas": 1 }

You can also force the _search to go to specific areas in your cluster 
using the preference parameter:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-preference.html

Using the right cluster shard allocation awareness strategy, you might be 
able to achieve forcing a replica to wherever you want:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#shards-allocation
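
As an illustrative sketch with the Java API (the index name and node id are
placeholders), the preference parameter can pin a search to one node:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;

public class PreferenceExample {
    // Runs the search only on the shards hosted by the given node.
    public static SearchResponse searchOnNode(Client client, String nodeId) {
        return client.prepareSearch("myindex")
                .setPreference("_only_node:" + nodeId)
                .setQuery(QueryBuilders.matchAllQuery())
                .execute().actionGet();
    }
}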



Re: fielddata breaker question

2014-03-10 Thread Dunaeth
For now, I'd say the field data size is quite flat, whereas the JVM heap
used grows as fielddata_breaker.estimated_size_in_bytes grows. I'll post
graphs when they become relevant.
If it's a JVM heap usage issue, could it be due to some kind of caching
issue (though the filter_cache seems small on each shard)?

Le lundi 10 mars 2014 09:43:56 UTC+1, Dunaeth a écrit :
>
> I'm asking our hosting provider to monitor these metrics. To avoid any
> confusion, the breaker indices size actually monitors the
> fielddata_breaker.estimated_size_in_bytes from the /_nodes/stats endpoint.
> Thanks for following this thread :)
>
> Le lundi 10 mars 2014 09:34:15 UTC+1, Martijn v Groningen a écrit :
>>
>> Yes, the breaker indices size does grow quickly. Can you share the same 
>> graphs for jvm heap used and field data size? 
>>
>>
>> On 10 March 2014 15:16, Dunaeth  wrote:
>>
>>> Hi,
>>>
>>> In order to be more precise, here are the graphs of the metrics we 
>>> monitor since we've had the fielddata breaker issue:
>>>
>>> As one can see, the indices grow kind of linearly with a size which 
>>> remains relatively small when the fielddata breaker estimated size grows 
>>> exponentially.
>>>
>>> Le jeudi 6 mars 2014 14:49:04 UTC+1, Dunaeth a écrit :
>>>
 At the moment, we have a whole index size of less than 100MB (less than
 200MB with backed-up data) and the estimated_size is 1.4GB... How are we
 supposed to deal with that kind of trouble?

 Le mardi 4 mars 2014 06:50:56 UTC+1, Dunaeth a écrit :
>
> Isn't it a bit weird that we reached an 800MB limit and short-circuited
> the data processing when our whole indices size is only 140MB (actually
> half this size, since it includes a backup node)?

  -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/73526a94-b16b-48b6-9a27-526b194b145f%40googlegroups.com
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> -- 
>> Met vriendelijke groet,
>>
>> Martijn van Groningen 
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8366c5bd-d9e3-49be-9ceb-12190625af1d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Can a replica be updated with the deltas only?

2014-03-10 Thread Binh Ly
I don't believe this is possible. ES does synchronous replication by default; 
when a replica is down while updates/inserts are coming in, that replica is 
simply invalidated and then fully recovered once it comes back up.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a9734310-58d9-4f0f-bc98-f7ef9bf7bd29%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Is it possible to isolate search queries to a single node

2014-03-10 Thread Garry Welding
You're looking for custom 
routing: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-routing-field.html

Some good blog posts on it here 
too: http://exploringelasticsearch.com/book/advanced-techniques/routing.html 
& http://www.elasticsearch.org/blog/customizing-your-document-routing/
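
A minimal sketch of what that looks like (index, type and the routing value 
"user_1" are placeholders):

# Index a document with an explicit routing value:
curl -XPUT 'localhost:9200/myindex/mytype/1?routing=user_1' -d '{
  "title": "some doc"
}'

# Searches that pass the same routing value only hit the shard(s) that value maps to:
curl -XGET 'localhost:9200/myindex/mytype/_search?routing=user_1' -d '{
  "query": { "match_all": {} }
}'

Note that routing narrows a query to particular shards rather than to a 
particular node, so it is a slightly different tool than the preference 
parameter mentioned elsewhere in this thread.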

On Monday, March 10, 2014 3:39:06 PM UTC, Yves Dorfsman wrote:
>
> I have a job that makes heavy use to ES, to the point that it affects the 
> cluster. Is it possible to:
>
>   - add a replica
>   - force the extra replica to a specific node
>   - isolate some of the queries to that particular node?
>
> Thanks.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b4ab46f3-b77b-41c6-b9e4-a405c2361cef%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Can a replica be updated with the deltas only?

2014-03-10 Thread Yves Dorfsman
When I shutdown a node that holds a replica and updates are happening to 
the rest of the cluster, then re-start this node, it seems that the entire 
replica is being copied again to that node.

Is there a way to make ES just update that node with the updates that 
happened while it was down?


Thanks.


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2d6138a0-5b4d-4ab4-9ef8-2f94beaef241%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Is it possible to isolate search queries to a single node

2014-03-10 Thread Yves Dorfsman
I have a job that makes heavy use of ES, to the point that it affects the 
cluster. Is it possible to:

  - add a replica
  - force the extra replica to a specific node
  - isolate some of the queries to that particular node?

Thanks.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2815f221-4828-4382-a246-97973cd98709%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: help with setting up mappings (Perl)

2014-03-10 Thread Clinton Gormley
Hi Dom

First, make sure you're using the new Search::Elasticsearch client
https://metacpan.org/pod/Search::Elasticsearch - we've just renamed it to
avoid namespace clashes with older clients.

Then: to configure the mapping yourself, you need to do it before you index
any data (using the bulk method or any other indexing method).

The full Elasticsearch REST API is supported by the Perl client, eg:

1) at index creation time:

$es->indices->create(
    index => 'myindex',
    body  => {
        mappings => {
            mytype => { ...mapping definition... }
        }
    }
);

See
https://metacpan.org/pod/Search::Elasticsearch::Client::Direct::Indices#create

2) Update mapping API (note, you can't change existing mappings, just add
new fields/types):

$es->indices->put_mapping(
    index => 'myindex',
    type  => 'mytype',
    body  => { ... mapping definition ... }
);

See
https://metacpan.org/pod/Search::Elasticsearch::Client::Direct::Indices#put_mapping
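
For the fields in your example, the mapping definition itself is just the usual 
JSON structure that the REST API takes (a sketch, assuming you want both fields 
kept as unanalyzed strings):

{
    "my_type" : {
        "properties" : {
            "something"     : { "type" : "string", "index" : "not_analyzed" },
            "somethingelse" : { "type" : "string", "index" : "not_analyzed" }
        }
    }
}

In the Perl client you pass the same structure as a hashref in the body 
parameter of the calls above.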

clint


On 10 March 2014 15:35, Dominic Nicholas wrote:

> Hi,
>
> I'm using the Elasticsearch Perl module and need guidance on setting up
> mappings.
> I'm using the bulk() method to index data. Here is an example of the
> structure of the data :
>
>   $response = $e->bulk(
>  "index" : "idx-2014.03.10",
>  "type" : "my_type",
>  "body" : [
> {
> "index" : {
> "_index" : "idx-2014.03.10",
> "_id" : "4410",
> "_type" : "my_type"
> }
> },
> {
> "something" : "interesting",
> "somethingelse" : "also interesting"
> },
> {
> "index" : {
> "_index" : "idx-2014.03.10",
> "_id" : "4411",
> "_type" : "my_type"
> }
> },
> {
> "something" : "very interesting",
> "somethingelse" : "not interesting"
> }
>  ]
>   );
>
> How do I set up mappings on various fields in the above example for
> 'something' and 'somethingelse' fields ?
> Also, how do I turn off the analyzer for an index (index: not_analyzed)
>  too ?
>
> I know there are several ways of setting up mappings such as :
>
> - when creating an index
> - by using the dedicated update mapping api
> - using index templates
>
> Ideally I'd like to use the dedicated update mapping api but am unclear
> how to use that through the Perl library interface (eg use 
> transport->perform_request()
> ?).
>
> Thanks for any guidance and help.
>
> Dom
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/f310961e-c389-49ea-82c9-c47d71d32209%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAPt3XKQhLPNsGhugoaYqWisBuKRp73oD%2BHcRDi2GE85iH8MrpA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Kibana Histogram unexpected line curve with cumulative value

2014-03-10 Thread Isaac Hazan


Kibana cannot do a histogram of the cumulative value of a field, as 
described at: https://github.com/elasticsearch/kibana/issues/740

To overcome that, I created a separate index where I calculate the total 
myself and save it to Elasticsearch.

The mapping looks as follows:

curl -XPOST localhost:9200/first_install -d '{
    "settings" : {
        "number_of_shards" : 5
    },
    "mappings" : {
        "fi" : {
            "properties" : {
                "evtTime" : { "type" : "date", "index" : "not_analyzed", "format" : "dd/MMM/:HH:mm:ss" },
                "cumulativeValue" : { "type" : "integer", "index" : "not_analyzed" }
            }
        }
    }
}'

The values are saved properly, but unexpectedly Kibana does not draw the 
line I would expect; instead it joins points that do not exist.

Following is the Kibana screenshot:

[image: enter image description here]

The line curve should always be increasing since my data set is always 
increasing, which I can prove by the following events as seen by Kibana 
itself:

[image: enter image description here]

Could it be related to the data formatting I did?

Thx in advance.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/26eb4c3b-a821-4fdf-ac42-90db0cdcb8e5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Completion Suggestor- Multi value field in a single document

2014-03-10 Thread vineeth mohan
Hello ,

Here it's explained how to index data for the completion suggester -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-completion.html#search-suggesters-completion

But it assumes that there is only one value for the suggest field.
What happens if there are multiple values for the suggest field?
How can we give separate input/output for multiple values in the same field?

Thanks
  Vineeth

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGdPd5n%2Br%3DYer9NAuzcBYkAjFBRjU7XC0yWcGwR-UhDVk46Z%3Dw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Fields with same name mapped differently in different types

2014-03-10 Thread morninlark
I also tried it with
{
  "query": {
"match_all": {}
  },
  "facets": {
"t2.stats": {
  "statistical": {
"field": "t2.sample_field"
  }
}
  }
}
but that results in the same error message.



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Fields-with-same-name-mapped-differently-in-different-types-tp4051423p4051431.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1394462849181-4051431.post%40n3.nabble.com.
For more options, visit https://groups.google.com/d/optout.


help with setting up mappings (Perl)

2014-03-10 Thread Dominic Nicholas
Hi,

I'm using the Elasticsearch Perl module and need guidance on setting up 
mappings.
I'm using the bulk() method to index data. Here is an example of the 
structure of the data :

  $response = $e->bulk(
      index => "idx-2014.03.10",
      type  => "my_type",
      body  => [
          {
              index => {
                  _index => "idx-2014.03.10",
                  _id    => "4410",
                  _type  => "my_type"
              }
          },
          {
              something     => "interesting",
              somethingelse => "also interesting"
          },
          {
              index => {
                  _index => "idx-2014.03.10",
                  _id    => "4411",
                  _type  => "my_type"
              }
          },
          {
              something     => "very interesting",
              somethingelse => "not interesting"
          }
      ]
  );

How do I set up mappings on various fields in the above example for 
'something' and 'somethingelse' fields ?
Also, how do I turn off the analyzer for an index (index: not_analyzed) 
 too ?

I know there are several ways of setting up mappings such as :

- when creating an index 
- by using the dedicated update mapping api
- using index templates

Ideally I'd like to use the dedicated update mapping api but am unclear how 
to use that through the Perl library interface (eg use 
transport->perform_request() 
?).

Thanks for any guidance and help.

Dom

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f310961e-c389-49ea-82c9-c47d71d32209%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: TransportClient timeout / webserver configuration - JAVA Api

2014-03-10 Thread Thomas
As per the documentation, the Client is thread-safe and the elasticsearch 
team suggests using the same client instance across your app. Considering 
your exception above, you might want to check your configuration first 
(cluster name and host/port); you should use port 9300 for the Java API. 
Finally, check whether a firewall or something similar is blocking port 
9300.
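
A couple of quick checks from the command line (a sketch; replace your-es-host 
with the node's address):

# The cluster name the node reports must match what the TransportClient is configured with:
curl -XGET 'http://your-es-host:9200/_cluster/health?pretty'

# The transport port must be reachable from the web server:
telnet your-es-host 9300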

Hope it helps
Thomas

On Monday, 10 March 2014 15:42:25 UTC+2, Robert Langenfeld wrote:
>
> Hello,
>
> I'm developing a tomcat webserver application that uses ElasticSearch 1.0 
> (Java API). There is a client facing desktop application that communicates 
> with the server so all the code for ElasticSearch is on that one instance 
> and it is used by all our clients. With that being said I am running into 
> this issue: After initializing a new TransportClient object and performing 
> some operation on it, there is a chance that i could sit idle for a very 
> long time. When does sit idle for a long time it gets this error:
>
>
> Mar 08, 2014 1:15:37 AM org.elasticsearch.client.transport
>
> INFO: [Elven] failed to get node info for 
> [#transport#-1][WIN7-113-00726][inet[/159.140.213.87:9300]], 
> disconnecting...
>
> org.elasticsearch.transport.RemoteTransportException: 
> [Server_Dev1][inet[/159.140.213.87:9300]][cluster/nodes/info]
>
> Caused by: java.lang.NullPointerException
>
> at org.elasticsearch.http.HttpInfo.writeTo(HttpInfo.java:82)
>
> at 
> org.elasticsearch.action.admin.cluster.node.info.NodeInfo.writeTo(NodeInfo.java:301)
>
> at 
> org.elasticsearch.action.admin.cluster.node.info.NodesInfoResponse.writeTo(NodesInfoResponse.java:63)
>
> at 
> org.elasticsearch.transport.netty.NettyTransportChannel.sendResponse(NettyTransportChannel.java:83)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$TransportHandler$1.onResponse(TransportNodesOperationAction.java:244)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$TransportHandler$1.onResponse(TransportNodesOperationAction.java:239)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.finishHim(TransportNodesOperationAction.java:225)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.onOperation(TransportNodesOperationAction.java:200)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.access$900(TransportNodesOperationAction.java:102)
>
> at 
> org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction$2.run(TransportNodesOperationAction.java:146)
>
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
> at java.lang.Thread.run(Thread.java:744)
>
> Is there any way to prevent this from happening? I know the ideal 
> situation would be that after every request the transport client is closed. 
> But since it lives on a webserver with lots of search requests coming in, 
> we would ideally like it to stay open because it takes 3-4 seconds for a 
> transport client to initialize and we are going for speed here.
>
> Also since we are having one central server to handle all search and index 
> requests, can the TransportClient handle multiple simultaneous requests 
> from different users at the same time? We just want to make sure that we 
> are doing this correctly.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/d625d553-2ed5-456a-9180-7b423874b43e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


TransportClient timeout / webserver configuration - JAVA Api

2014-03-10 Thread Robert Langenfeld
Hello,

I'm developing a tomcat webserver application that uses ElasticSearch 1.0 
(Java API). There is a client facing desktop application that communicates 
with the server so all the code for ElasticSearch is on that one instance 
and it is used by all our clients. With that being said I am running into 
this issue: After initializing a new TransportClient object and performing 
some operation on it, there is a chance that it could sit idle for a very 
long time. When it does sit idle for a long time, it gets this error:


Mar 08, 2014 1:15:37 AM org.elasticsearch.client.transport

INFO: [Elven] failed to get node info for 
[#transport#-1][WIN7-113-00726][inet[/159.140.213.87:9300]], 
disconnecting...

org.elasticsearch.transport.RemoteTransportException: 
[Server_Dev1][inet[/159.140.213.87:9300]][cluster/nodes/info]

Caused by: java.lang.NullPointerException

at org.elasticsearch.http.HttpInfo.writeTo(HttpInfo.java:82)

at 
org.elasticsearch.action.admin.cluster.node.info.NodeInfo.writeTo(NodeInfo.java:301)

at 
org.elasticsearch.action.admin.cluster.node.info.NodesInfoResponse.writeTo(NodesInfoResponse.java:63)

at 
org.elasticsearch.transport.netty.NettyTransportChannel.sendResponse(NettyTransportChannel.java:83)

at 
org.elasticsearch.action.support.nodes.TransportNodesOperationAction$TransportHandler$1.onResponse(TransportNodesOperationAction.java:244)

at 
org.elasticsearch.action.support.nodes.TransportNodesOperationAction$TransportHandler$1.onResponse(TransportNodesOperationAction.java:239)

at 
org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.finishHim(TransportNodesOperationAction.java:225)

at 
org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.onOperation(TransportNodesOperationAction.java:200)

at 
org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction.access$900(TransportNodesOperationAction.java:102)

at 
org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction$2.run(TransportNodesOperationAction.java:146)

at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

Is there any way to prevent this from happening? I know the ideal situation 
would be that after every request the transport client is closed. But since 
it lives on a webserver with lots of search requests coming in, we would 
ideally like it to stay open because it takes 3-4 seconds for a transport 
client to initialize and we are going for speed here.

Also since we are having one central server to handle all search and index 
requests, can the TransportClient handle multiple simultaneous requests 
from different users at the same time? We just want to make sure that we 
are doing this correctly.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/cb489c01-1e1a-400f-8d46-95db9ba782e3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Unable to load script under config/scripts

2014-03-10 Thread Thomas
Hi,

I'm trying to keep some scripts within config/scripts, but elasticsearch 
does not seem to locate them. What could be a possible reason for this?

When the script needs to be invoked, ES fails with the following:

No such property:  for class: Script1

Any ideas?

Thanks

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f96b90e4-7704-49e0-8dd6-38ef1ebe6558%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch File System River Plugin 1.0.0 released

2014-03-10 Thread David Pilato

Heya,


I am pleased to announce the release of the Elasticsearch File System River 
Plugin, version 1.0.0.

FS River Plugin offers a simple way to index local files into elasticsearch.

https://github.com/dadoonet/fsriver/

Release Notes - fsriver - Version 1.0.0



Update:
 * [48] - Update to elasticsearch 1.0.0 
(https://github.com/dadoonet/fsriver/issues/48)




Issues, Pull requests, Feature requests are warmly welcome on fsriver project 
repository: https://github.com/dadoonet/fsriver/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-David

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/20140310130217.22D6CA626F%40smtp3-g21.free.fr.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch File System River Plugin 0.5.0 released

2014-03-10 Thread David Pilato

Heya,


I am pleased to announce the release of the Elasticsearch File System River 
Plugin, version 0.5.0.

FS River Plugin offers a simple way to index local files into elasticsearch.

https://github.com/dadoonet/fsriver/

Release Notes - fsriver - Version 0.5.0


Fix:
 * [53] - file.filename should be not_analyzed 
(https://github.com/dadoonet/fsriver/issues/53)

Update:
 * [47] - Move tests to elasticsearch test framework 
(https://github.com/dadoonet/fsriver/issues/47)

New:
 * [54] - Add plugin release semi-automatic script 
(https://github.com/dadoonet/fsriver/issues/54)
 * [50] - Add SSH port setting (https://github.com/dadoonet/fsriver/issues/50)

Doc:
 * [43] - Reformat code (use spaces instead of tabs) 
(https://github.com/dadoonet/fsriver/issues/43)
 * [42] - Add exhaustive list of all parameters 
(https://github.com/dadoonet/fsriver/issues/42)


Issues, Pull requests, Feature requests are warmly welcome on fsriver project 
repository: https://github.com/dadoonet/fsriver/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-David

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/20140310125453.73F7682261%40smtp6-g21.free.fr.
For more options, visit https://groups.google.com/d/optout.


Fields with same name mapped differently in different types

2014-03-10 Thread morninlark
Hey there,

I have one index containing various types. It does happen that a field with
the same name is mapped as string in one type and as integer in another
type.

The following query on a type t1, where "sample_field" is mapped as string
{
  "query": {
"match_all": {}
  },
  "facets": {
"stats": {
  "statistical": {
"field": "sample_field"
  }
}
  }
}

results in:
=> FacetPhaseExecutionException[Facet [stats]: field [sample_field] isn't a
number field, but a string]
which is of course the expected feedback

The same query on type t2, where "sample_field" is mapped as integer, leads
to the following error:
=> NumberFormatException[Invalid shift value in prefixCoded bytes (is
encoded value really an INT?)]

I know I could either rename the fields or put the types in different
indexes to avoid this error.
I want to keep the mappings as they are.
Is there a more elegant way to solve this issue? 

I am using elasticsearch version 0.90.7

Thank you!



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Fields-with-same-name-mapped-differently-in-different-types-tp4051423.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1394455870429-4051423.post%40n3.nabble.com.
For more options, visit https://groups.google.com/d/optout.


Re: DELETE snapshot request (which was long-running and had not yet completed) is hung

2014-03-10 Thread Igor Motov
That's strange. Wrong S3 permissions should have caused it to fail 
immediately. Could you provide any more details about the permissions, so I 
can reproduce it? Meanwhile, restarting the nodes where primary shards of 
the stuck index are located is the only option that I can think of. 

We are working on improving the performance of snapshot cancelation 
(https://github.com/elasticsearch/elasticsearch/pull/5244), but it didn't 
make it to a release yet.
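
If it helps while you wait, you can ask the cluster what it thinks the snapshot's 
state is (repository and snapshot names below are placeholders):

curl -XGET 'localhost:9200/_snapshot/my_backup/my_snapshot?pretty'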

On Monday, March 10, 2014 4:39:24 AM UTC-4, Swaroop wrote:
>
> Hi, 
>
> I had started a snapshot request on a freshly-indexed ES 1.0.1 cluster 
> with cloud plugin installed, but unfortunately the EC2 access keys 
> configured did not have S3 permissions, so ES was in a weird state, so I 
> sent a DELETE snapshot request and it's stuck for more than a couple of 
> hours, any advice on what to do here to cleanup the snapshot request? Logs 
> don't reveal anything relevant. 
>
> Regards, 
> Swaroop 
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c9ef80ee-5512-44bd-b301-54496a31f4b6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Error "array index out of bounds java.lang.OutOfMemoryError: Java heap space"

2014-03-10 Thread prashy
I am using Red Hat.

[root@localhost /]# lsb_release
LSB Version:   
:core-3.1-amd64:core-3.1-ia32:core-3.1-noarch:graphics-3.1-amd64:graphics-3.1-ia32:graphics-3.1-noarch
[root@localhost /]# lsb_release -a
LSB Version:   
:core-3.1-amd64:core-3.1-ia32:core-3.1-noarch:graphics-3.1-amd64:graphics-3.1-ia32:graphics-3.1-noarch
Distributor ID: RedHatEnterpriseServer
Description:Red Hat Enterprise Linux Server release 5.5 (Tikanga)
Release:5.5
Codename:   Tikanga




--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Error-array-index-out-of-bounds-java-lang-OutOfMemoryError-Java-heap-space-tp4050914p4051420.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1394449376908-4051420.post%40n3.nabble.com.
For more options, visit https://groups.google.com/d/optout.


Re: Error "array index out of bounds java.lang.OutOfMemoryError: Java heap space"

2014-03-10 Thread Clinton Gormley
Also, are you using Ubuntu 10.04?  I see you have slow young generation GC,
and that version of Ubuntu had a bug in that area.


On 10 March 2014 10:24, prashy  wrote:

> Hi gkwelding,
>
> I have checked explicitly on my box and the value for MAX_OPEN_FILES and
> MAX_MAP_COUNT has been set to 65535.
>
>
>
> --
> View this message in context:
> http://elasticsearch-users.115913.n3.nabble.com/Error-array-index-out-of-bounds-java-lang-OutOfMemoryError-Java-heap-space-tp4050914p4051415.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1394443468998-4051415.post%40n3.nabble.com
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAPt3XKTs3wz84fG4sV1f170J%3DUZAiJULdhUc8RXGVPGBKfqQoQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: elasticsearch is taking more memory than allocated heap when using bootstrap.mlockall: true

2014-03-10 Thread joergpra...@gmail.com
A Java JVM process takes more memory than just the heap. There are shared
libraries/classes, thread stacks, buffers, etc. that come in addition to the
heap memory.

If you want to analyze the process memory, you can examine the process map
on Linux with pmap -x <pid>
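
For example (a sketch, assuming a single elasticsearch process on the box):

pmap -x $(pgrep -f org.elasticsearch.bootstrap)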

Jörg

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGsmFzwn%2BCmHY_eqhUKy%3DpEP9gv8No1%3DCdUje%2BZGAYi%3DQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Aggregations and facets - different results

2014-03-10 Thread Adrien Grand
Hi,

As you can see, on the other hand, aggregations give counts for "4", "5",
"6", "long" and "name" while facets don't. This is due to term selection:
by default aggregations only return the top 10 terms (configurable through
the `size` parameter) and those top terms are sorted by count desc, then
term asc. This is the reason why you didn't get "thing" or "thingy" that
would have been at positions higher than 10. If you want to get counts for
more tags, you could consider increasing the size of your terms aggregation.
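
For example, something along these lines (the field name "tags" and the index 
name are assumptions based on your gist; adjust the size to taste):

curl -XGET 'localhost:9200/myindex/_search?search_type=count' -d '{
  "aggs": {
    "tags": {
      "terms": { "field": "tags", "size": 50 }
    }
  }
}'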


On Sat, Mar 8, 2014 at 3:26 PM, Marek Obuchowicz <
marek.obuchow...@project-a.com> wrote:

> Hi,
>
> I've been using aggregations (terms) instead of facets as I used to do
> before (Elasticsearch 1.01 on OS-X).
> After working perfectly fine for a while, I found out that I stopped
> getting new data in the aggregated queries. The unaggregated results look
> fine, data is fresh. Have tried _refresh and restarting elasticsearch
> (single shard, no replicas) as well, didn't help.
>
> The results can be seen on this gist:
> https://gist.github.com/marek-obuchowicz/9431175
>
> As you can see, the aggregation doesn't show, for example, tags "thing" or
> "thingy", while facets do.
>
> Any tips? Or, "just keep using facets"? :)
>
> Best regards,
>   Marek Obuchowicz
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/0a455cdd-e307-48af-b927-79839430a490%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Adrien Grand

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j4g1Nuv4E6QkynCopOe3HyaEAaG-C1PqpAKNAhpnbawYw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Error "array index out of bounds java.lang.OutOfMemoryError: Java heap space"

2014-03-10 Thread prashy
Hi gkwelding,

I have checked explicitly on my box and the value for MAX_OPEN_FILES and
MAX_MAP_COUNT has been set to 65535.



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Error-array-index-out-of-bounds-java-lang-OutOfMemoryError-Java-heap-space-tp4050914p4051415.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1394443468998-4051415.post%40n3.nabble.com.
For more options, visit https://groups.google.com/d/optout.


elasticsearch is taking more memory than allocated heap when using bootstrap.mlockall: true

2014-03-10 Thread Subhadip Bagui


Hi,

I've allocated the same max and min heap with *ES_HEAP_SIZE=1200m* in 
elasticsearch and I'm using *bootstrap.mlockall: true* as suggested by 
elasticsearch so that process memory won't get swapped.

But when I start elasticsearch it takes more memory than the max heap 
specified, around 1.4g, and it holds on to that memory. The usage is not 
fluctuating now, which is a good thing. 

Can you please explain why the JVM is taking more memory than allocated?

--Subhadip


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/98cfaf5a-4559-4b7e-80ac-816626690e49%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: fielddata breaker question

2014-03-10 Thread Dunaeth
I'm asking our hosting provider to monitor these metrics. To avoid any 
confusion, the breaker indices size actually monitors 
fielddata_breaker.estimated_size_in_bytes from the /_nodes/stats endpoint. 
Thanks for following this thread :)

On Monday, March 10, 2014 at 09:34:15 UTC+1, Martijn v Groningen wrote:
>
> Yes, the breaker indices size does grow quickly. Can you share the same 
> graphs for jvm heap used and field data size? 
>
>
> On 10 March 2014 15:16, Dunaeth > wrote:
>
>> Hi,
>>
>> In order to be more precise, here are the graphs of the metrics we 
>> monitor since we've had the fielddata breaker issue :
>>
>> [monitoring graphs not included in the archive]
>>
>> As one can see, the indices grow kind of linearly with a size which 
>> remains relatively small when the fielddata breaker estimated size grows 
>> exponentially.
>>
>> Le jeudi 6 mars 2014 14:49:04 UTC+1, Dunaeth a écrit :
>>
>>> At the moment, we have a whole index size of less than 100MB (less than 
>>> 200MB with backuped data) and the estimated_size is 1.4GB... How are we 
>>> supposed to deal we that kind of trouble ?
>>>
>>> Le mardi 4 mars 2014 06:50:56 UTC+1, Dunaeth a écrit :

 Isn't it a bit weird that we reached a 800MB limit and shortcircuited 
 the data processing when our whole indices size is only 140MB (half this 
 size actually since it includes a backup node) ?
>>>
>>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/73526a94-b16b-48b6-9a27-526b194b145f%40googlegroups.com
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Met vriendelijke groet,
>
> Martijn van Groningen 
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4e171de7-8bd7-4cb4-8a83-989d7dddee21%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


DELETE snapshot request (which was long-running and had not yet completed) is hung

2014-03-10 Thread Swaroop CH
Hi,

I had started a snapshot request on a freshly-indexed ES 1.0.1 cluster with 
cloud plugin installed, but unfortunately the EC2 access keys configured did 
not have S3 permissions, so ES was in a weird state, so I sent a DELETE 
snapshot request and it's stuck for more than a couple of hours, any advice on 
what to do here to cleanup the snapshot request? Logs don't reveal anything 
relevant.

Regards,
Swaroop

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/531801394440764%40web11j.yandex.ru.
For more options, visit https://groups.google.com/d/optout.


Re: fielddata breaker question

2014-03-10 Thread Martijn v Groningen
Yes, the breaker indices size does grow quickly. Can you share the same
graphs for jvm heap used and field data size?


On 10 March 2014 15:16, Dunaeth  wrote:

> Hi,
>
> In order to be more precise, here are the graphs of the metrics we monitor
> since we've had the fielddata breaker issue :
>
> [monitoring graphs not included in the archive]
>
> As one can see, the indices grow kind of linearly with a size which
> remains relatively small when the fielddata breaker estimated size grows
> exponentially.
>
> Le jeudi 6 mars 2014 14:49:04 UTC+1, Dunaeth a écrit :
>
>> At the moment, we have a whole index size of less than 100MB (less than
>> 200MB with backuped data) and the estimated_size is 1.4GB... How are we
>> supposed to deal we that kind of trouble ?
>>
>> Le mardi 4 mars 2014 06:50:56 UTC+1, Dunaeth a écrit :
>>>
>>> Isn't it a bit weird that we reached a 800MB limit and shortcircuited
>>> the data processing when our whole indices size is only 140MB (half this
>>> size actually since it includes a backup node) ?
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/73526a94-b16b-48b6-9a27-526b194b145f%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Met vriendelijke groet,

Martijn van Groningen

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CA%2BA76TzgRF1Fo9dshF33WmyoJ5Cu6hO_rg4BxeUuwXARM3Dagg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Marvel strategy to data store

2014-03-10 Thread David Pilato
You can not delete from Marvel yet.

But you can run

curl -XDELETE http://localhost:9200/.marvel- 

or use curator:

https://github.com/elasticsearch/curator
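
For example (the concrete index name below is hypothetical; Marvel creates one 
index per day, named .marvel-YYYY.MM.DD):

# See which Marvel indices exist:
curl -XGET 'localhost:9200/_cat/indices/.marvel-*?v'

# Delete one day's worth of data:
curl -XDELETE 'localhost:9200/.marvel-2014.03.01'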

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On March 10, 2014 at 09:27:25, Shilpi Agrawal (shil...@strumsoft.com) wrote:

How can we delete the data from marvel.
some Instruction or commands would be helpful.

Thanks

On Monday, March 10, 2014 12:13:22 PM UTC+5:30, Shilpi Agrawal wrote:
Hi,

I want to know the strategy which marvel follows to store the data.
Like for how long it store the data and how it flushes the data and how much 
data can be stored any limitations.

how it manages the data which is stored via any cluster.

Please help me to let me know this.

Thanks
--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b2d3c23d-21aa-4ffc-a875-befdb89b2f14%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/etPan.531d77e0.238e1f29.b095%40MacBook-Air-de-David.local.
For more options, visit https://groups.google.com/d/optout.


Re: Marvel strategy to data store

2014-03-10 Thread Shilpi Agrawal
How can we delete the data from Marvel?
Some instructions or commands would be helpful.

Thanks

On Monday, March 10, 2014 12:13:22 PM UTC+5:30, Shilpi Agrawal wrote:
>
> Hi,
>
> I want to know the strategy which marvel follows to store the data.
> Like for how long it store the data and how it flushes the data and how 
> much data can be stored any limitations.
>
> how it manages the data which is stored via any cluster.
>
> Please help me to let me know this.
>
> Thanks
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b2d3c23d-21aa-4ffc-a875-befdb89b2f14%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Parse Failure [Failed to parse source...] - while doing string search in elasticsearch

2014-03-10 Thread Subhadip Bagui
Thanks Binh. It's working now...

On Thursday, March 6, 2014 1:56:26 PM UTC+5:30, Subhadip Bagui wrote:
>
> Hi,
>
> I'm new in elasticsearch and need some help. I'm getting below exception 
> while trying to search a string in elasticsearch using Kibana. Below are 
> the components I'm using.
> 1. logstash 1.3.3
> 2. redis : 2.4.10
> 3. elasticsearch: 1.0
> 4.kibana
>
> I'm running elasticsearch with default, single cluster and 1 node. Please 
> help to resolve this issue.
> --
> elasticsearch conf:
> ##
> bootstrap.mlockall: true
> indices.cache.filter.size: 20%
> index.cache.field.max_size: 5
> index.cache.field.expire: 10m
> indices.fielddata.cache.size: 30%
> indices.fielddata.cache.expire: -1
> indices.memory.index_buffer_size: 50%
> index.refresh_interval: 30s
> index.translog.flush_threshold_ops: 5
> index.store.compress.stored: true
>
> threadpool.search.type: fixed
> threadpool.search.size: 20
> threadpool.search.queue_size: 100
>
> threadpool.index.type: fixed
> threadpool.index.size: 30
> threadpool.index.queue_size: 200
>
> threadpool.bulk.type: fixed
> threadpool.bulk.size: 60
> threadpool.bulk.queue_size: 300
> ##
>
> 
> [2014-03-06 13:27:08,235][DEBUG][action.search.type   ] [Katherine 
> "Kitty" Pryde] [logstash-2014.03.06][1], node[QUrNRW5cSi6IS3P_BCZkHw], [P], 
> s[STARTED]: Failed to execute 
> [org.elasticsearch.action.search.SearchRequest@1c03a4f] lastShard [true]
> org.elasticsearch.search.SearchParseException: [logstash-2014.03.06][1]: 
> from[-1],size[-1]: Parse Failure [Failed to parse source 
> [{"query":{"filtered":{"query":{"bool":{"should":[{"query_string":{"query":"/
> ec2.ap-southeast-1.amazonaws.com
> "}}]}},"filter":{"bool":{"must":[{"match_all":{}},{"range":{"@timestamp":{"from":1394006201505,"to":"now"}}},{"bool":{"must":[{"match_all":{}}]}}],"highlight":{"fields":{},"fragment_size":2147483647,"pre_tags":["@start-highlight@"],"post_tags":["@end-highlight@"]},"size":2,"sort":[{"@timestamp":{"order":"desc"}}]}]]
> at 
> org.elasticsearch.search.SearchService.parseSource(SearchService.java:586)
> at 
> org.elasticsearch.search.SearchService.createContext(SearchService.java:489)
> at 
> org.elasticsearch.search.SearchService.createContext(SearchService.java:474)
> at 
> org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:467)
> at 
> org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:239)
> at 
> org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
> at 
> org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
> at 
> org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
> at 
> org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
> at 
> org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.elasticsearch.index.query.QueryParsingException: 
> [logstash-2014.03.06] Failed to parse query [/
> ec2.ap-southeast-1.amazonaws.com]
> at 
> org.elasticsearch.index.query.QueryStringQueryParser.parse(QueryStringQueryParser.java:235)
> at 
> org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:223)
> at 
> org.elasticsearch.index.query.BoolQueryParser.parse(BoolQueryParser.java:107)
> at 
> org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:223)
> at 
> org.elasticsearch.index.query.FilteredQueryParser.parse(FilteredQueryParser.java:71)
> at 
> org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:223)
> at 
> org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:321)
> at 
> org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:260)
> at 
> org.elasticsearch.search.query.QueryParseElement.parse(QueryParseElement.java:33)
> at 
> org.elasticsearch.search.SearchService.parseSource(SearchService.java:574)
> ... 12 more
> Caused by: org.apache.lucene.queryparser.classic.ParseException: Cannot 
> parse '/ec2.ap-southeast-1.amazonaws.com': Lexical error at line 1, 
> column 34.  Encountered:  after : "/ec2.ap-southeast-1.amazonaws.com"
> at 
> org.apache.lucene.queryparser.cl

Re: fielddata breaker question

2014-03-10 Thread Dunaeth
Hi,

In order to be more precise, here are the graphs of the metrics we monitor 
since we've had the fielddata breaker issue :

[monitoring graphs not included in the archive]

As one can see, the indices grow kind of linearly with a size which remains 
relatively small when the fielddata breaker estimated size grows 
exponentially.

On Thursday, March 6, 2014 at 14:49:04 UTC+1, Dunaeth wrote:
>
> At the moment, we have a whole index size of less than 100MB (less than 
> 200MB with backuped data) and the estimated_size is 1.4GB... How are we 
> supposed to deal we that kind of trouble ?
>
> Le mardi 4 mars 2014 06:50:56 UTC+1, Dunaeth a écrit :
>>
>> Isn't it a bit weird that we reached a 800MB limit and shortcircuited the 
>> data processing when our whole indices size is only 140MB (half this size 
>> actually since it includes a backup node) ?
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/73526a94-b16b-48b6-9a27-526b194b145f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Running out of memory when parsing the large text file.

2014-03-10 Thread Ivan Ji
Hi all,

This time, I read the string content out from the file and push it into a 
field inside a document, whose analyzer is standard.
Without the attachment mapper, the same condition occurred. It throws 
"java.lang.OutOfMemoryError: Java heap space" when the total index is just 
400MB and the document count is 10.

What are the suggestions for these large text files?

I am considering using a smarter analyzer which might eliminate some 
redundancies, but are there any other options?

cheers,

Ivan

On Friday, January 10, 2014 at 9:59:13 AM UTC+8, Ivan Ji wrote:
>
> Hi all,
>
> I post several large text files, which are about 20~30MB and contains all 
> the text, into ES. And I use the attachment mapper to be the field type to 
> store these file.
> It cost memory very much. Even when I post one file, the used memory grows 
> from about 150MB to 250MB. BTW, I use the default tokenizer for these field.
>
> Although this file can be generated many tokens, but what I don't 
> understand is the memory cost. Does it store all the tokens into memory?
>
> Ideas?
>
> Cheers,
>
> Ivan
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/75fa9adc-03c8-45a4-9360-c0926a7d0f1d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Marvel strategy to data store

2014-03-10 Thread David Pilato
It stores data forever, so you basically need to remove old data after some days.
curator could help here.

In the future, elasticsearch will have a built-in feature which will do 
that. But for now, you need to take care of it yourself.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On March 10, 2014 at 07:43, Shilpi Agrawal wrote:

Hi,

I want to know the strategy which marvel follows to store the data.
Like for how long it store the data and how it flushes the data and how much 
data can be stored any limitations.

how it manages the data which is stored via any cluster.

Please help me to let me know this.

Thanks
-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f3ff7696-6d9b-44c9-9f98-63674c13e464%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/26F32A0F-C227-430B-A536-50E066C457FF%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.