pull only fields with given value

2015-03-17 Thread Adrian
I have a JSON like this for a document

_source: {
  timestamp: 213234234,
  links: [
    {
      mention: "audi",
      entity: { rank: 3, name: "some name" }
    },
    {
      mention: "ford",
      entity: { rank: 0, name: "some other name" }
    }
  ]
}

I'm interested in retrieving only the mention and rank fields where 
rank == 0.
I am able to specify which fields I want using "fields", like this: 
"fields": ["timestamp", "links.mention", "links.entity.rank"], and I can 
even filter (a filtered query with a term filter on links.entity.rank = 0) 
so that it returns only documents that have rank = 0.

Such a query returns all the fields I mention, but from all the objects in 
the links array:

_source: {
  timestamp: [213234234],
  links.mention: [ "audi", "ford" ],
  links.entity.rank: [ 3, 0 ]
}

I don't want 3 to appear in links.entity.rank. Is there a way to re-filter 
the result of a query?
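One possible approach (an assumption, not confirmed by the thread): if the `links` field is mapped as a `nested` type, a nested query with `inner_hits` (available in ES 1.5+) returns only the matching array entries, so the rank 3 object would be excluded. A sketch of such a request body, using the field names from the post:

```python
import json

# Sketch only: assumes `links` is mapped as `nested` and an ES version
# with inner_hits support (1.5+). Field names come from the post.
query = {
    "_source": ["timestamp"],
    "query": {
        "nested": {
            "path": "links",
            "query": {"term": {"links.entity.rank": 0}},
            # inner_hits returns only the array entries that matched,
            # so only the rank == 0 object appears in the response
            "inner_hits": {"_source": ["links.mention", "links.entity.rank"]}
        }
    }
}
print(json.dumps(query, indent=2))
```

Without a nested mapping, `links.*` fields are flattened per document, which is why all array values come back together.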

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e9cdf199-f8f0-4fcc-8642-ef7a1f4eab58%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Terms facet not "searching" all indices

2015-02-17 Thread Adrian Howchin
Hi all,

This may be a newbie question, so please forgive me. I've searched the docs 
but can't find anything to answer this. 

I have Kibana 3.1. I have a number of different indexes inside 
elasticsearch, and I have a view of Kibana configured to look at indexes A, 
B and C. 

I can search and see logs from them fine, but, for example, my "terms" 
facet does not show results from all 3. It only shows results from the 
first one "A". 

Is there a way to get the terms facet to show results from all 3 indexes? 
Am I simply missing something in my config?
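As a sanity check outside Kibana, a terms facet can be sent to several indices in one request by naming them all in the URL path. A sketch of what that raw request might look like (index names and the field are made up for illustration):

```python
import json

# Hypothetical sketch: a 1.x-era terms facet queried across three indices
# at once by listing them in the request path. Names are illustrative.
url_path = "/indexA,indexB,indexC/_search"
body = {
    "size": 0,  # facet counts only, no hits
    "facets": {
        "top_terms": {                       # facet name (arbitrary)
            "terms": {"field": "status", "size": 10}
        }
    }
}
print(url_path, json.dumps(body))
```

If this returns terms from all three indices but the Kibana panel does not, the problem is likely in the dashboard's index configuration rather than in Elasticsearch.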

Cheers all,
Adrian



Re: Fielddata circuit breaker problems

2014-10-02 Thread Adrian Luna
When creating the index, specify a mapping and use the fielddata settings. 
Take the example from the documentation and embed it in a normal mapping 
definition. However, I realised that it only works with 
string/numeric/geo_point fields, so you may need to use a timestamp for 
your time field.

Hope it helps.
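The suggestion above can be sketched as follows: embedding the on-disk fielddata format into an ordinary mapping, per the 1.x-era fielddata documentation. The type and field names here are illustrative assumptions:

```python
import json

# One possible shape (per 1.x-era docs): a not_analyzed-style time field
# mapped with the on-disk "doc_values" fielddata format, embedded in a
# normal mapping. Type name "event" and field "time" are illustrative.
mapping = {
    "mappings": {
        "event": {
            "properties": {
                "time": {
                    "type": "date",
                    "fielddata": {"format": "doc_values"}
                }
            }
        }
    }
}
print(json.dumps(mapping))
```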

El jueves, 2 de octubre de 2014 03:29:30 UTC+2, Dave Galbraith escribió:
>
> Hi! So I have millions and millions of documents in my Elasticsearch, each 
> one of which has a field called "time". I need the results of my queries to 
> come back in chronological order. So I put a 
> "sort":{"time":{"order":"asc"}} in all my queries. This was going great 
> on smaller data sets but then Elasticsearch started sending me 500s and 
> circuit breaker exceptions started showing up in the logs with "data for 
> field time would be too large". So I checked out 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.html
>  
> and that looks a lot like what I've been seeing: seems like it's trying to 
> pull all the millions of time values into memory even if they're not 
> relevant to my query. What are my options for fixing this? I can't 
> compromise chronological order, it's at the heart of my application. "More 
> memory" would be a short-term fix but the idea is to scale this thing to 
> trillions and trillions of points and that's a race I don't want to run. 
> Can I make these exceptions go away without totally tanking performance? 
> Thanks!
>



Re: Upgrading from very old version of ES with zero down time

2014-10-02 Thread Adrian Luna
I would just start another node (or a couple of nodes for safety, 
maintaining replicas on different machines, etc.) in your cluster with the 
new ES version, wait until the data gets copied, turn off every node 
running the deprecated (old) ES version, and then restart the nodes you 
turned off after updating their version.

Alternatively, you can also do something similar to this:
http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/

Hope some of this helps.
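The core of the linked blog post is the atomic alias swap: clients talk only to an alias, data is reindexed into a new index, and the alias is then moved in a single request so readers never see a gap. A sketch of the `_aliases` body (index and alias names are illustrative):

```python
import json

# Sketch of the atomic alias swap from the linked blog post: clients use
# the alias "myindex"; once "myindex_v2" is fully populated, the alias is
# moved from v1 to v2 in one atomic request. Names are illustrative.
alias_swap = {
    "actions": [
        {"remove": {"index": "myindex_v1", "alias": "myindex"}},
        {"add":    {"index": "myindex_v2", "alias": "myindex"}}
    ]
}
print(json.dumps(alias_swap))
```

Because both actions happen in one request, there is no moment at which the alias points to neither index.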

El jueves, 2 de octubre de 2014 02:28:02 UTC+2, Eugene Strokin escribió:
>
> Hello,
> my ES cluster is still running version 0.20.1. It is time to upgrade. I 
> know I cannot just use the indexes as-is and replace the jars with the 
> newest ES; they are not compatible, as far as I understand.
> So I need to set up a parallel cluster with the newest ES and somehow 
> transfer all the data with zero downtime. The size of the indexes is about 
> 100 GB and the traffic is relatively heavy, so it could take some time, and 
> somehow I need to keep the clusters in sync.
> Has anyone had such experience?
> Does anyone have suggestions on how to approach this?
> I cannot come up with an elegant solution.
> Any help is greatly appreciated.
> Thank you,
> Eugene
>
>



Re: Any General approaches on resolving FieldData CircuitBreakingExceptions

2014-10-02 Thread Adrian Luna
Will setting the fielddata format as doc_values help?

El jueves, 2 de octubre de 2014 11:28:50 UTC+2, dayo.o...@gmail.com 
escribió:
>
> Hi,
>
> I've recently encountered the following CircuitBreakingException 
>
> [2014-09-16 11:03:28,698][ERROR][indices.fielddata.breaker] [Master Khan] 
> New used memory 640211624 [610.5mb] from field [url] would be larger than 
> configured breaker: 639015321 [609.4mb], breaking
> [2014-09-16 11:03:28,698][DEBUG][action.search.type   ] [Master Khan] 
> [events_v2][4], node[N5VujlU7R0mr2aC9wdzOIw], [R], s[STARTED]: Failed to 
> execute [org.elasticsearch.action.search.SearchRequest@3814fd0c] lastShard 
> [true]
> org.elasticsearch.search.query.QueryPhaseExecutionException: 
> [events_v2][4]: query[ConstantScore(*:*)],from[0],size[0]: Query Failed 
> [Failed to execute main query]
> at 
> org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:162)
> at 
> org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:261)
> at 
> org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:206)
> at 
> org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:203)
> at 
> org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:517)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.elasticsearch.ElasticsearchException: 
> org.elasticsearch.common.breaker.CircuitBreakingException: Data too large, 
> data for field [url] would be larger than limit of [639015321/609.4mb]
> at 
> org.elasticsearch.index.fielddata.AbstractIndexFieldData.load(AbstractIndexFieldData.java:79)
> at 
> org.elasticsearch.index.fielddata.plain.AbstractBytesIndexFieldData.load(AbstractBytesIndexFieldData.java:41)
>
> The nodes were restarted with more memory, but obviously this isn't a long 
> term solution.
> My understanding of the above Exception is that a query ran, that resulted 
> in the Fielddata cache's limit being breached.
>
> My current plan is to try and find out which query is causing this by 
> monitoring the field data stats for the different queries,
> and then to possibly look into either using fielddata filtering or doc 
> values in the mappings.
>
> Do you guys have any other advice/suggestions/guidelines for dealing with 
> this sort of issue?
>
> Thanks
> Dayo
>



Re: Fielddata circuit breaker problems

2014-10-02 Thread Adrian Luna
Have you tried indexing your data using "doc_values" as your fielddata 
format?

El jueves, 2 de octubre de 2014 03:29:30 UTC+2, Dave Galbraith escribió:
>
> Hi! So I have millions and millions of documents in my Elasticsearch, each 
> one of which has a field called "time". I need the results of my queries to 
> come back in chronological order. So I put a 
> "sort":{"time":{"order":"asc"}} in all my queries. This was going great 
> on smaller data sets but then Elasticsearch started sending me 500s and 
> circuit breaker exceptions started showing up in the logs with "data for 
> field time would be too large". So I checked out 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.html
>  
> and that looks a lot like what I've been seeing: seems like it's trying to 
> pull all the millions of time values into memory even if they're not 
> relevant to my query. What are my options for fixing this? I can't 
> compromise chronological order, it's at the heart of my application. "More 
> memory" would be a short-term fix but the idea is to scale this thing to 
> trillions and trillions of points and that's a race I don't want to run. 
> Can I make these exceptions go away without totally tanking performance? 
> Thanks!
>



Re: Setting up default mapping Elasticsearch

2014-10-02 Thread Adrian Luna
I've tried it with 1.1.x, 1.2.x and 1.3.x.
1.1.x won't give an error because it doesn't make use of the default 
mapping.
In the other versions, the default mapping is not working at all.

Thanks, I'll use the index templates since it works perfectly. 
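For reference, the dynamic_templates mapping from the original post can be wrapped in an index template, so it is applied automatically to every new index at creation time. The template name and index pattern here are illustrative assumptions:

```python
import json

# Sketch: the poster's dynamic_templates mapping wrapped in an index
# template. Indices matching the pattern pick it up at creation time,
# so nothing needs to be configured per index. Pattern is illustrative.
template = {
    "template": "*",                 # apply to all newly created indices
    "mappings": {
        "event": {
            "dynamic_templates": [{
                "template_1": {
                    "match": "*",
                    "mapping": {"type": "string", "index": "not_analyzed"}
                }
            }]
        }
    }
}
print(json.dumps(template))
```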

El jueves, 2 de octubre de 2014 11:42:49 UTC+2, Tanguy Leroux escribió:
>
> Using index templates is definitely the way to go :)
>
> Which version of ES are you using?
>
> -- Tanguy
>
> Le jeudi 2 octobre 2014 11:19:39 UTC+2, Adrian Luna a écrit :
>>
>> Ok , I think I solved the issue using index templates. However, I still 
>> feel like there's something missing here.. so I would appreciate any 
>> further information...
>>
>> Cheers
>>
>> El jueves, 2 de octubre de 2014 11:08:53 UTC+2, Adrian Luna escribió:
>>>
>>> From the documentation you can set up default mapping definition for 
>>> every index just putting the .json file inside 
>>> $ELASTICSEARCH_HOME/config/mappings/_default
>>>
>>> However, after doing this, I just get the error:
>>>
>>> MapperParsingException[mapping [default_mapping]]; nested: 
>>> MapperParsingException[Root type mapping not empty after parsing! 
>>>
>>> My mapping file looks like:
>>>
>>> 
>>>  {
>>>   "event" : {
>>> "dynamic_templates" : [
>>>   {
>>> "template_1" : {
>>>   "match" : "*",
>>>   "mapping" : {
>>> "type" : "string",
>>> "index": "not_analyzed"
>>>   }
>>> }
>>>   }
>>> ]
>>>   }
>>> }
>>>
>>>
>>>
>>> The problem is that I need to setup several things for each index I 
>>> create, but I don't want to care about updating the map per index, since I 
>>> would need to ask if the index exists before doing this. That means, during 
>>> my workflow I just want to index documents, do not want to care about 
>>> settings stuff.
>>>
>>> Thanks in advance!
>>>
>>



Re: Setting up default mapping Elasticsearch

2014-10-02 Thread Adrian Luna
Ok , I think I solved the issue using index templates. However, I still 
feel like there's something missing here.. so I would appreciate any 
further information...

Cheers

El jueves, 2 de octubre de 2014 11:08:53 UTC+2, Adrian Luna escribió:
>
> From the documentation you can set up default mapping definition for every 
> index just putting the .json file inside 
> $ELASTICSEARCH_HOME/config/mappings/_default
>
> However, after doing this, I just get the error:
>
> MapperParsingException[mapping [default_mapping]]; nested: 
> MapperParsingException[Root type mapping not empty after parsing! 
>
> My mapping file looks like:
>
> 
>  {
>   "event" : {
> "dynamic_templates" : [
>   {
> "template_1" : {
>   "match" : "*",
>   "mapping" : {
> "type" : "string",
> "index": "not_analyzed"
>   }
> }
>   }
> ]
>   }
> }
>
>
>
> The problem is that I need to setup several things for each index I 
> create, but I don't want to care about updating the map per index, since I 
> would need to ask if the index exists before doing this. That means, during 
> my workflow I just want to index documents, do not want to care about 
> settings stuff.
>
> Thanks in advance!
>



Setting up default mapping Elasticsearch

2014-10-02 Thread Adrian Luna
From the documentation, you can set up a default mapping definition for 
every index just by putting the .json file inside 
$ELASTICSEARCH_HOME/config/mappings/_default

However, after doing this, I just get the error:

MapperParsingException[mapping [default_mapping]]; nested: 
MapperParsingException[Root type mapping not empty after parsing! 

My mapping file looks like:


 {
  "event" : {
"dynamic_templates" : [
  {
"template_1" : {
  "match" : "*",
  "mapping" : {
"type" : "string",
"index": "not_analyzed"
  }
}
  }
]
  }
}



The problem is that I need to set up several things for each index I 
create, but I don't want to have to update the mapping per index, since I 
would need to check whether the index exists before doing so. In other 
words, during my workflow I just want to index documents and not worry 
about settings.

Thanks in advance!



Jaro-Winkler Query

2014-08-14 Thread Adrian C
Hi,

Is it possible to use Jaro-Winkler rather than Levenshtein distance for 
fuzzy queries? Any ideas how one could go about extending ES to enable 
this?

I have looked at using function_score as below, however this does not work 
for me, since I cannot combine it with another query that uses the 
provided analyzers. For example, I would like to combine a search using 
phonetic encoding & Jaro-Winkler distance.

query: {
  "function_score" : {
"query" : {
  "match_all" : { }
},
"script_score" : {
  "script" : "daon-jaro-winkler",
  "lang" : "native",
  "params" : {
"fullname_untokenized" : {
  "query" : "some name"
}
  }
}
  }
}
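One thing that might help (an assumption, not a confirmed answer): function_score accepts any query, not just match_all, so the match_all above could be replaced with a match query against a phonetic-analyzed field. The candidates would then be narrowed by the phonetic match and rescored by the native Jaro-Winkler script. The phonetic field name below is hypothetical; the script name and params come from the post:

```python
import json

# Hedged sketch: function_score wrapping a real query instead of
# match_all. "fullname_phonetic" is a hypothetical field analyzed with a
# phonetic analyzer; the native script name/params are from the post.
query = {
    "query": {
        "function_score": {
            "query": {
                "match": {"fullname_phonetic": "some name"}
            },
            "script_score": {
                "script": "daon-jaro-winkler",
                "lang": "native",
                "params": {"fullname_untokenized": {"query": "some name"}}
            }
        }
    }
}
print(json.dumps(query))
```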

Thanks, 
Adrian



Re: long String retrieved as empty; short String retrieved fine; why ES?

2014-07-23 Thread Adrian
Correction: I get somewhere around 220 characters in that String NOT 40-50 
as I originally mentioned.


On Wednesday, 23 July 2014 15:36:10 UTC-4, Adrian wrote:
>
> I use this script to inspect data in various docs from ES.
>
> {
>   "query": {
> "match_all": {}
>   },
>   "sort": {
> "_script": {
>   "script": "if(doc['site'].values.contains(12)){return 
> 'foo'}else{return doc['dataX'].values }",
>   "type": "string",
>   "order": "desc"
> }
>   }
> }
>
> The only important part of this script is that it will always go in the 
> else clause and print  doc['dataX'].values
>
> I also have this document JSON:
> {
> "doc":{
>   "site" : "gencoupe.com",
>   "name" : "amount-active-users",
>   "daily" : {
> "dataX": "1000,490,390,600,300",
> "dataY": "1388538061, 1388624461, 1388710861, 1388797261, 133661",
> "startDate":1388538061,
> "endDate":133661
>   }
>
> }
> }
>
> I've noticed that if the string dataX is longer than forty-something 
> characters, the script I presented retrieves dataX as an empty array. If 
> the string is 40 chars or less it works fine and retrieves the contents of 
> the String.
>
> Is there a limit on how many char a String can store in ES? Why am I 
> seeing this inconsistent behaviour and how can I make it retrieve the 
> entire string, of potentially very large size?
>
>



long String retrieved as empty; short String retrieved fine; why ES?

2014-07-23 Thread Adrian
I use this script to inspect data in various docs from ES.

{
  "query": {
"match_all": {}
  },
  "sort": {
"_script": {
  "script": "if(doc['site'].values.contains(12)){return 
'foo'}else{return doc['dataX'].values }",
  "type": "string",
  "order": "desc"
}
  }
}

The only important part of this script is that it will always go in the 
else clause and print  doc['dataX'].values

I also have this document JSON:
{
"doc":{
  "site" : "gencoupe.com",
  "name" : "amount-active-users",
  "daily" : {
"dataX": "1000,490,390,600,300",
"dataY": "1388538061, 1388624461, 1388710861, 1388797261, 133661",
"startDate":1388538061,
"endDate":133661
  }

}
}

I've noticed that if the string dataX is longer than forty-something 
characters, the script I presented retrieves dataX as an empty array. If 
the string is 40 chars or less it works fine and retrieves the contents of 
the String.

Is there a limit on how many char a String can store in ES? Why am I seeing 
this inconsistent behaviour and how can I make it retrieve the entire 
string, of potentially very large size?
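One plausible explanation (an assumption, not confirmed by the thread): `doc[...]` in a script reads fielddata, i.e. the analyzed terms of the field, not the original stored value, so an analyzed string does not come back as one intact string. Mapping dataX as not_analyzed keeps the whole value as a single term. A sketch of such a mapping, using the field names from the post:

```python
import json

# Hedged sketch: mapping dataX as a not_analyzed string so that
# doc['...'].values in a script sees the whole value as one term.
# This is one plausible fix, not a confirmed diagnosis.
mapping = {
    "properties": {
        "daily": {
            "properties": {
                "dataX": {"type": "string", "index": "not_analyzed"}
            }
        }
    }
}
print(json.dumps(mapping))
```

Alternatively, reading the value from `_source` in the script avoids fielddata entirely.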



non existing scripts execute - what is going on

2014-07-23 Thread Adrian
My windows test machine where I have ES installed has restarted 
automatically several times during the day. I was testing some custom 
scripts in ES.

After restart I went back to continue my tests and realized that any script 
I told ES to run it would do the same thing and always give me results. 
Even when I used scripts which did not exist. Take this example:

{
  "query": {
"custom_score": {
  "query": {
"match_all": {}
  },
  "script": "sdgfhjgf",
  "params": {
"site": "rgf"
  },
  "lang": "native"
}
  }
}

This returns documents, each with a score of 1! The script sdgfhjgf 
doesn't even exist, yet ES returns results from it. What is going on?

PS - I tried this same JSON query on a different ES cluster and as expected 
I get an error.



How to figure out field type?

2014-07-17 Thread Adrian
I've added some data to my ES.

JSON format:

{
"doc":{
  "site" : "marriage.com",
  "name" : "amount-active-users",

  "daily" : {
"dataX": [1,2,3],
"dataY": [1388538061, 1388624461, 1388710861],
"startDate":1388538061,
"endDate":1388710861
  }

}
}

If you look at the dataX field, it's an array. ES interprets it as an 
array of Longs.

Now when I add another JSON doc with dataX containing doubles, I'm not 
sure, on the Java side, how to know whether the input arrives as 
ScriptDocValues.Doubles or ScriptDocValues.Longs.
I would like the data to be interpreted as doubles all the time.

This is the terrible code I use to cast, but it's not working after adding 
the doubles to the dataX field:

List dataXTimeSeries2Long = ((ScriptDocValues.Doubles) 
        doc().get(rootPathDataX)).getValues();
List dataXTimeSeries2 = new ArrayList();
// reverse the TS so that it matches the order of series retrieved via
// client() - yes, those are reversed
for (int i = dataXTimeSeries2Long.size() - 1; i > -1; i--) {
    // using toString is slow
    dataXTimeSeries2.add(
        Double.parseDouble(dataXTimeSeries2Long.get(i).toString()));
}

This code fails with a class cast exception:
ClassCastException[org.elasticsearch.index.fielddata.ScriptDocValues$Longs 
cannot be cast to 
org.elasticsearch.index.fielddata.ScriptDocValues$Doubles]; }]
I can do that in Scala. Java is behind.

Help!
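One way to sidestep the cast problem (an assumption, not a confirmed answer): declare dataX as double in the mapping before indexing, so ES never infers `long` from the first document and scripts consistently receive ScriptDocValues.Doubles. The type and field names follow the post:

```python
import json

# Sketch: an explicit mapping that declares dataX as double up front,
# so ES does not infer `long` from the first document indexed. Type and
# field names follow the post; the rest is an assumption.
mapping = {
    "doc": {
        "properties": {
            "daily": {
                "properties": {
                    "dataX": {"type": "double"}
                }
            }
        }
    }
}
print(json.dumps(mapping))
```

Longs and doubles cannot be mixed under one field type, so fixing the mapping removes the need to branch on the runtime class at all.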



Re: native script in ES

2014-07-10 Thread Adrian
I understand what you mean regarding efficiency/speed. 
My reason for doing things this way was so that I could compare 2 time 
series, ones coming from 1 query and another coming from another query.

On Thursday, 10 July 2014 15:27:28 UTC-4, Jörg Prante wrote:
>
> Yes, this is a lookup from one document of another document using a get 
> request. Note the result of the lookup is a single document source, whereas 
> a search request provides a search response with a document source list, 
> which is vastly oversized for use from a script, and takes even longer to 
> complete. Note also the comment "This is not very efficient"  ... searches 
> would be even more harmful at that place in a script code. I do not 
> recommend it. Instead, use search requests from a client.
>
> Jörg
>
>
> On Thu, Jul 10, 2014 at 9:14 PM, Adrian wrote:
>
>> Ok, thanks.
>>
>> Though I've noticed that I can inject Node in the Factory method to get 
>> the client in the script as shown here: 
>> https://github.com/imotov/elasticsearch-native-script-example/blob/master/src/main/java/org/elasticsearch/examples/nativescript/script/LookupScript.java
>> .
>>
>> Couldn't I in this case perform a query through the client?
>>
>>
>>
>>
>> On Wednesday, 9 July 2014 19:03:49 UTC-4, Jörg Prante wrote:
>>
>>> Scripts are not an alternative to write filters or queries, they are 
>>> part of the filter or query computation. With scripts, you can access 
>>> document fields and perform evaluations depending on one or more field 
>>> values, for example, for custom scoring. Also, index term statistics can be 
>>> accessed from scripts. Script results are passed back to the query 
>>> processing.
>>>
>>> Jörg
>>>
>>>
>>> On Wed, Jul 9, 2014 at 10:41 PM, Adrian  wrote:
>>>
>>>> Hi,
>>>> I'm curious how scripts work in ES.
>>>> I've created a simple script which checks if a field in a record is 
>>>> equal to a certain value then return 1 otherwise return 0. I used 
>>>> AbstractDoubleSearchScript as the super class for this script.
>>>> My records contains some a data field that is an int and a date field 
>>>> (long) (some other ones too that I don't care too much yet)
>>>>
>>>> What I've noticed is that when I execute the script, each record in the 
>>>> ES db goes through the script and is returned either w a score of 0 or 1.
>>>> What I would like to know is if it is possible to do a range query, say 
>>>> given 2 date params, and select only the records that fall between those 
>>>> dates to run through the script.
>>>>
>>>> I know how to do this with the API but not in script format. For 
>>>> example this non script code works:
>>>>  SearchRequestBuilder s = client.prepareSearch("posts")
>>>> .setQuery(QueryBuilders.queryString(query).
>>>> defaultField("post.body"))
>>>> .setSize(count)
>>>> .setFrom(start)
>>>> .setFilter(FilterBuilders.rangeFilter("post.
>>>> creationDate").from(interval.getStart.toString()).to(
>>>> interval.getEnd.toString()))
>>>> .setFilter(FilterBuilders.existsFilter("userId"));
>>>> s.setSearchType(SearchType.DFS_QUERY_THEN_FETCH);
>>>>
>>>> But when extending AbstractDoubleSearchScript I have access only to a 
>>>> few things like doc(), source(), fields(), lookup() and I'm not sure how 
>>>> to 
>>>> filter records.
>>>> Also I can't find documentation on what these are ie: what is source() 
>>>> other than a Map.
>>>>
>>>> Any help on filtering is greatly appreciated!
>>>> Thanks
>>>>
>>>>
>>>>  -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to elasticsearc...@googlegroups.com.
>>>>
>>>> To view this discussion on the web visit https://groups.google.com/d/
>>>> msgid/elasticsearch/1673cee1-468e-462b-85a4-c8e45fa74156%
>>>> 40googlegroups.com 
>>>> <https://groups.google.com/d/msgid/elasticsearch/1673cee1-468e-462b-85a4-c8e45fa74156%40googlegroups.com?utm_medium=email&utm

Re: native script in ES

2014-07-10 Thread Adrian
Ok, thanks.

Though I've noticed that I can inject Node in the Factory method to get the 
client in the script as shown here: 
https://github.com/imotov/elasticsearch-native-script-example/blob/master/src/main/java/org/elasticsearch/examples/nativescript/script/LookupScript.java.

Couldn't I in this case perform a query through the client?



On Wednesday, 9 July 2014 19:03:49 UTC-4, Jörg Prante wrote:
>
> Scripts are not an alternative to write filters or queries, they are part 
> of the filter or query computation. With scripts, you can access document 
> fields and perform evaluations depending on one or more field values, for 
> example, for custom scoring. Also, index term statistics can be accessed 
> from scripts. Script results are passed back to the query processing.
>
> Jörg
>
>
> On Wed, Jul 9, 2014 at 10:41 PM, Adrian wrote:
>
>> Hi,
>> I'm curious how scripts work in ES.
>> I've created a simple script which checks if a field in a record is equal 
>> to a certain value then return 1 otherwise return 0. I used 
>> AbstractDoubleSearchScript as the super class for this script.
>> My records contains some a data field that is an int and a date field 
>> (long) (some other ones too that I don't care too much yet)
>>
>> What I've noticed is that when I execute the script, each record in the 
>> ES db goes through the script and is returned either w a score of 0 or 1.
>> What I would like to know is if it is possible to do a range query, say 
>> given 2 date params, and select only the records that fall between those 
>> dates to run through the script.
>>
>> I know how to do this with the API but not in script format. For example 
>> this non script code works:
>>  SearchRequestBuilder s = client.prepareSearch("posts")
>> 
>> .setQuery(QueryBuilders.queryString(query).defaultField("post.body"))
>> .setSize(count)
>> .setFrom(start)
>> 
>> .setFilter(FilterBuilders.rangeFilter("post.creationDate").from(interval.getStart.toString()).to(interval.getEnd.toString()))
>> .setFilter(FilterBuilders.existsFilter("userId"));
>> s.setSearchType(SearchType.DFS_QUERY_THEN_FETCH);
>>
>> But when extending AbstractDoubleSearchScript I have access only to a few 
>> things like doc(), source(), fields(), lookup() and I'm not sure how to 
>> filter records.
>> Also I can't find documentation on what these are ie: what is source() 
>> other than a Map.
>>
>> Any help on filtering is greatly appreciated!
>> Thanks
>>
>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/1673cee1-468e-462b-85a4-c8e45fa74156%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/elasticsearch/1673cee1-468e-462b-85a4-c8e45fa74156%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>



Is it possible to pass 2 queries as params to a script

2014-07-10 Thread Adrian
I want to compare the results of 2 queries.

Is it possible to do 2 queries in one go and pass the result from the 
first query as param 1 to a script and the result from the second query as 
param 2 to the same script?

Thanks
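As far as I know, Elasticsearch has no server-side way to feed the result of one query into another as a script parameter; the usual pattern is to run both queries (a single `_msearch` request at least saves a round trip), extract values client-side, and pass them as script params in a follow-up request. A sketch of building the `_msearch` NDJSON body (the index name `posts` and the queries are made up for illustration):

```python
import json

def msearch_body(index, queries):
    """Build the NDJSON body for _msearch: one header line followed by
    one query-body line per search; the body must end with a newline."""
    lines = []
    for q in queries:
        lines.append(json.dumps({"index": index}))  # header line
        lines.append(json.dumps(q))                 # query body line
    return "\n".join(lines) + "\n"

q1 = {"query": {"term": {"status": "new"}}}
q2 = {"query": {"term": {"status": "done"}}}
body = msearch_body("posts", [q1, q2])
```

The response contains one result object per submitted query, in order, so the two result sets can be compared or combined in the client.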




native script in ES

2014-07-09 Thread Adrian
Hi,
I'm curious how scripts work in ES.
I've created a simple script which checks whether a field in a record is 
equal to a certain value and returns 1 if so, otherwise 0. I used 
AbstractDoubleSearchScript as the superclass for this script.
My records contain a data field that is an int and a date field (long), 
plus some others I don't care about yet.

What I've noticed is that when I execute the script, each record in the ES 
db goes through the script and is returned with a score of either 0 or 1.
What I would like to know is if it is possible to do a range query, say 
given 2 date params, and select only the records that fall between those 
dates to run through the script.

I know how to do this with the API but not in script format. For example 
this non script code works:
 SearchRequestBuilder s = client.prepareSearch("posts")
     .setQuery(QueryBuilders.queryString(query).defaultField("post.body"))
     .setSize(count)
     .setFrom(start)
     .setFilter(FilterBuilders.rangeFilter("post.creationDate").from(interval.getStart.toString()).to(interval.getEnd.toString()))
     .setFilter(FilterBuilders.existsFilter("userId"));
 s.setSearchType(SearchType.DFS_QUERY_THEN_FETCH);

But when extending AbstractDoubleSearchScript I have access only to a few 
things like doc(), source(), fields(), lookup() and I'm not sure how to 
filter records.
Also, I can't find documentation on what these are, i.e. what is source() 
other than a Map?

Any help on filtering is greatly appreciated!
Thanks
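A script doesn't have to do the filtering itself: the query decides which documents reach the script at all. One way in the 1.x query DSL is to wrap the script in a `function_score` query whose inner query is a `filtered` query carrying the range and exists filters, so only documents in the date range are ever scored by the native script. A hedged sketch of the request body, assuming a native script registered under the hypothetical name `my_script`:

```python
import json

# Only documents passing the filtered inner query reach the script_score
# function, so the script never sees records outside the date range.
# Field values and the script name are illustrative.
body = {
    "query": {
        "function_score": {
            "query": {
                "filtered": {
                    "query": {"query_string": {"default_field": "post.body",
                                               "query": "elasticsearch"}},
                    "filter": {
                        "and": [
                            {"range": {"post.creationDate":
                                       {"from": "2014-01-01", "to": "2014-06-30"}}},
                            {"exists": {"field": "userId"}},
                        ]
                    },
                }
            },
            "script_score": {"script": "my_script", "lang": "native"},
        }
    }
}
payload = json.dumps(body)
```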




Re: Tutorial on Java interface to ElasticSearch?

2014-07-08 Thread Adrian
On Thu, Jul 03, 2014 at 09:20:05AM -0700, Ivan Brusic wrote:

Ivan,

> Currently the best way to learn the Java API is to view the Elasticsearch
> search code.

Or just sift through the generated Java API Documentation. You can find some 
at: http://javadoc.kyubu.de/elasticsearch.

Best, Adrian



Re: reverse_nested aggregation facing troubles when applied to array of nested objects

2014-06-17 Thread Adrian Luna
If that's not possible, is there any other way to aggregate by several 
fields? Or do I need to make 2 different aggregations and then merge them 
in my application?

On Tuesday, June 17, 2014 at 15:11:26 UTC+2, Adrian Luna wrote:
>
> Ok, just realized something. The problem wasn't related to this. But in 
> order to use the 1.2 version (which first expose this reverse_nested 
> functionallity), something seem to change from the 1.1 version I was using 
> before.
>
> Something I usually did before is aggregation by several fields using the 
> same aggregation name in order to "merge" the results (which I must 
> recognize, I have never seen documented). I mean:
>
> {
>  "aggs":{
>"forms":[
>  {"terms":{"field":"object_of_type_a.form"}},
>  {"terms":{"field":"object_of_type_b.form"}}
>]
>  }
> }
>
> Such thing was working on previous versions, but not anymore?
>
> Thanks in advance
>
> On Tuesday, June 17, 2014 at 14:18:08 UTC+2, Adrian Luna wrote:
>>
>> I have an issue where my mapping includes an array of nested objects. 
>> Let's imagine something simplified like this:
>>
>> 
>>
>



Re: reverse_nested aggregation facing troubles when applied to array of nested objects

2014-06-17 Thread Adrian Luna
OK, I just realized something. The problem wasn't related to this. But in 
order to use the 1.2 version (which first exposes this reverse_nested 
functionality), something seems to have changed from the 1.1 version I was 
using before.

Something I usually did before is aggregating by several fields using the 
same aggregation name in order to "merge" the results (which, I must 
admit, I have never seen documented). I mean:

{
 "aggs":{
   "forms":[
 {"terms":{"field":"object_of_type_a.form"}},
 {"terms":{"field":"object_of_type_b.form"}}
   ]
 }
}

Such thing was working on previous versions, but not anymore?

Thanks in advance

On Tuesday, June 17, 2014 at 14:18:08 UTC+2, Adrian Luna wrote:
>
> I have an issue where my mapping includes an array of nested objects. 
> Let's imagine something simplified like this:
>
> 
>



reverse_nested aggregation facing troubles when applied to array of nested objects

2014-06-17 Thread Adrian Luna
I have an issue where my mapping includes an array of nested objects. Let's 
imagine something simplified like this:

{
  "properties": {
    "datetime": { "type": "date" },
    "tags": {
      "type": "object",
      "properties": {
        "object_of_type_a": { "type": "nested", "properties": { "##SOME FIELDS##" } },
        "object_of_type_b": { "type": "nested", "properties": { "##SOME FIELDS##" } }
      }
    }
  }
}

Both object_of_type_a and object_of_type_b are arrays of the actual nested 
object. 

So, one doc may look like:

{
"datetime":"17-06-2014T14:11",
"##other fields I don't care about right now##",
"tags":{
  "object_of_type_a":[{"form":"whatever",...},{"form":"another thing",...}],
  "object_of_type_b":[{"form":"something else",...},{"form":"others",...}],
}
}



Now imagine I want to aggregate for each element of some of the fields from 
one of the inner objects, but also obtain their histogram based on the 
top-level field ("datetime").
 
"aggs": {
  "top_agg": {
    "nested": {
      "path": "tags.object_of_type_a"
    },
    "aggs": {
      "medium_agg": {
        "terms": {
          "size": 5,
          "field": "tags.object_of_type_a.form"
        },
        "aggs": {
          "reverse": {
            "reverse_nested": {},
            "aggs": {
              "timeline": {
                "date_histogram": {
                  "field": "datetime",
                  "interval": "day"
                }
              }
            }
          }
        }
      }
    }
  }
}



Once I try to do so, I am getting an error:

Parse Failure [Aggregation definition for [object_of_type_a starts with a 
[START_ARRAY], expected a [START_OBJECT].]]; }


Is it possible to perform such an aggregation?
Thanks in advance. Really appreciate any help you can provide..
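The parse error "[START_ARRAY], expected a [START_OBJECT]" comes from giving one aggregation name an array of definitions; each aggregation definition must be a single object. To cover two fields, one option is to declare two sibling named aggregations and merge the buckets client-side. A sketch using the field names from the example mapping:

```python
# Two sibling aggregations, one per nested path; each definition is an
# object, never an array. Buckets from forms_a and forms_b can then be
# merged in the application.
body = {
    "aggs": {
        "forms_a": {
            "nested": {"path": "tags.object_of_type_a"},
            "aggs": {"forms": {"terms": {"field": "tags.object_of_type_a.form"}}},
        },
        "forms_b": {
            "nested": {"path": "tags.object_of_type_b"},
            "aggs": {"forms": {"terms": {"field": "tags.object_of_type_b.form"}}},
        },
    }
}

for definition in body["aggs"].values():
    assert isinstance(definition, dict)  # objects, not arrays
```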



Re: Levenshtein distance

2014-05-29 Thread Adrian C
Resolved this by setting transpositions true on the request - didn't see 
this option documented but found it by looking through the source.

{
   "size":50,
   "query":{
  "fuzzy":{
 "surname":{
"value":"arosn",
"transpositions":true,
"fuzziness":2,
"prefix_length":1,
    "max_expansions":100
 }
  }
   }
}

On Wednesday, 28 May 2014 13:10:22 UTC+1, Adrian C wrote:
>
> Hi,
>
> I am new to ES and have been doing some simple testing of fuzzy matching. 
> I have a query related to Levenshtein distance. Does ElasticSearch 
> use Levenshtein distance or Damerau–Levenshtein distance?
>
> For example I have the following text stored in an index (analyzer: 
> simple):
> AARONS
>
> When search using 'arosn' the text is not found. The queries that I have 
> been testing with are as follows:
>
> {
>"size":50,
>"query":{
>   "fuzzy":{
>  "surname":{
> "value":"arosn",
> "fuzziness":2,
> "prefix_length":1,
> "max_expansions":100
>  }
>   }
>}
> }
>
> and 
>
> {
>"size":50,
>"query":{
>   "match":{
>  "surname":{
> "query":"arosn",
> "fuzziness":2
>  }
>   }
>}
> }
>
> {
>"size":50,
>"query":{
>   "match":{
>  "surname":{
> "query":"arosn~",
> "fuzziness":2
>  }
>   }
>}
> }
>
> {
>"size":50,
>"query":{
>   "query_string":{
>  "default_field":"surname",
>  "fuzziness":2,
>  "query":"arosn~2"
>   }
>}
> }
>
> If the Damerau–Levenshtein distance algorithm were in use, then I would 
> expect this to match with a distance of two:
>
> arosn + insert (a) → aarosn + swap (n & s) → aarons
>
> I am a little confused as there is reference to Damerau–Levenshtein: 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_fuzziness
>
> So any ideas on how I can get Damerau–Levenshtein to work?
>
> Thanks
>



Levenshtein distance

2014-05-28 Thread Adrian C
Hi,

I am new to ES and have been doing some simple testing of fuzzy matching. I 
have a query related to Levenshtein distance. Does ElasticSearch 
use Levenshtein distance or Damerau–Levenshtein distance?

For example I have the following text stored in an index (analyzer: simple):
AARONS

When searching for 'arosn', the text is not found. The queries that I have 
been testing with are as follows:

{
   "size":50,
   "query":{
  "fuzzy":{
 "surname":{
"value":"arosn",
"fuzziness":2,
"prefix_length":1,
"max_expansions":100
 }
  }
   }
}

and 

{
   "size":50,
   "query":{
  "match":{
 "surname":{
"query":"arosn",
"fuzziness":2
 }
  }
   }
}

{
   "size":50,
   "query":{
  "match":{
 "surname":{
"query":"arosn~",
"fuzziness":2
 }
  }
   }
}

{
   "size":50,
   "query":{
  "query_string":{
 "default_field":"surname",
 "fuzziness":2,
 "query":"arosn~2"
  }
   }
}

If the Damerau–Levenshtein distance algorithm were in use, then I would 
expect this to match with a distance of two:

arosn + insert (a) → aarosn + swap (n & s) → aarons

I am a little confused as there is reference to Damerau–Levenshtein: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_fuzziness

So any ideas on how I can get Damerau–Levenshtein to work?

Thanks
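The difference between the two distances can be checked directly: plain Levenshtein counts an adjacent swap as two substitutions, while the optimal-string-alignment variant (Levenshtein plus adjacent transposition, which is roughly what a `transpositions` option corresponds to) counts it as one. A small self-contained sketch:

```python
def levenshtein(a, b):
    """Plain Levenshtein: insert, delete, substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # delete
                           cur[j - 1] + 1,       # insert
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]

def osa(a, b):
    """Optimal string alignment: Levenshtein plus adjacent transposition."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = a[i - 1] != b[j - 1]
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]

# 'arosn' -> 'aarons': insert 'a', then swap 's' and 'n'.
print(levenshtein("arosn", "aarons"))  # 3: the swap costs two substitutions
print(osa("arosn", "aarons"))          # 2: insert + one transposition
```

This is why the query only matches once transpositions count as a single edit: with plain Levenshtein the distance is 3, beyond the fuzziness of 2.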



fuzziness & score computation

2014-03-20 Thread Adrian Luna
Hi, 

Sorry that I am relatively fresh to elasticsearch so please don't be too 
harsh.

I feel like I'm unable to understand the behaviour of any of the 
fuzzy queries in ES.

*1) match with fuzziness enabled*

{
  "query": {
"fuzzy_like_this_field": {
  "field_name": {
"like_text": "car renting London",
"fuzziness": "0.5"
  }
}
  }
}

As I see it from my tests, this kind of query will give the same score to 
documents with field_name="car renting London" and "car ranting London" or 
"car renting Londen", for example. That means it does not score 
misspellings any lower. I imagine that the possible variants are computed 
first and the score is then computed with a "representative score" which is 
the same for every variant that matches the requirements.

Am I right? If I am, is there any way to boost the exact match over the 
fuzzy match?

Also I get results with more terms getting the same score, like "cheap car 
renting London" and "offers car renting London". That's something I cannot 
understand. When I use the explain API, it seems that the resulting score 
is a sum of the different matches with their internal weightings, tf-idf, 
etc., but it seems not to be considering the terms outside the query, while 
I would expect the exact match to score at least slightly higher.

Am I missing something here? Or is this just the expected result and I am 
being too demanding?

*2) fuzzy query*

That doesn't do what I want, since it does not analyze the query (I think), 
and so it will treat the query in an unexpected way for my purposes of 
"free text" search.

*3) fuzzy_like_this or fuzzy_like_this_field*

This other query gets rid of the first problem in point 1, since, as I read 
from the documentation, it seems to use some tricks to avoid favouring rare 
terms (which is where misspellings will be) over more frequent terms, etc., 
but it is still giving the same score to the exact match and to matches 
where other terms are present.

Is there any way to get the expected behaviour? By this I mean being able 
to execute almost free-text queries with some fuzziness to get rid of 
possible misspellings in the query terms, but with an (at least for me) 
more exhaustive score computation. If not, is there any other more complex 
query or a function_score that achieves this?

Thank you very much, any comment will be pretty much appreciated. Also, if 
I am not right in my suppositions, any clarification will be very welcome.



Re: Sorting date fields

2014-03-04 Thread Adrian
On Wed, Feb 26, 2014 at 12:21:26AM +0100, joergpra...@gmail.com wrote:

Jörg,

sorry for the late answer.

> Maybe you can set up an example of your sort  as a demo, so that the error
> can be reproduced?

It turned out that this behaviour was caused by me, since documents contained
the wrong timestamp on the sorted field. After fixing this, the results were as
expected.

Thanks for the pointers and your help,

Adrian



Re: Sorting date fields

2014-02-25 Thread Adrian
On Tue, Feb 25, 2014 at 11:11:13PM +0100, joergpra...@gmail.com wrote:

Jörg,

> ES loads the values of the fields to sort on into memory cache.

Yes, I've read that - is it known when these caches are flushed?

> You should update to 1.0.0, maybe you hit a bug that has been fixed.

I'll do that. I am just wondering if I am missing something .. 

Best regards, Adrian



Sorting date fields

2014-02-25 Thread Adrian
Hi all,

I have a question on how sorting during queries works in elasticsearch. 

I have an index with a custom date format field, on which the sort is applied.
When querying the index for a given keyword, results are returned in the given
sort order.

However, I've observed that some documents are not present in the result set. I
would have expected these documents to be part of the result set, as they would
be in relational systems using the SQL ORDER BY statement. I've verified that
these missing documents are covered by the query using the explain API.
According to the documentation, score computation is not performed when sorting
on fields.

Maybe someone can provide more information on how sorting is done? 

I am using Elasticsearch 1.0.0RC1 on Debian wheezy with openjdk-7-jdk.

Thanks, Adrian
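Sorting never removes matched documents from the result set (it behaves like SQL's ORDER BY); when hits seem to disappear, the usual suspects are documents the query didn't actually match or sort-field values that don't parse as expected. The `missing` option at least makes the placement of documents without the field explicit. A sketch of a sorted query body (field names are illustrative):

```python
# Sort on a date field, sending documents without the field to the end
# rather than letting their placement be implementation-defined.
body = {
    "query": {"match": {"body": "elasticsearch"}},
    "sort": [
        {"created": {"order": "desc", "missing": "_last"}}
    ],
}
```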



Re: Elasticsearch 1.0.0 is now GA

2014-02-21 Thread Adrian
On Sat, Feb 22, 2014 at 08:57:24AM +1100, Mark Walkom wrote:

Mark,

> > > Removing and re-installing the ES package either removes the original or
> > >  the existing elasticsearch.yml
> > Does this happen for the debian packages as well?
> apt will ask you if you want to keep it, overwrite it, compare it etc.

This is what I would have expected.

Best, Adrian



Re: Elasticsearch 1.0.0 is now GA

2014-02-21 Thread Adrian
On Mon, Feb 17, 2014 at 02:14:46PM -0800, Tony Su wrote:

> What?!
>  
> Removing and re-installing the ES package either removes the original or 
>  the existing elasticsearch.yml

Does this happen for the debian packages as well?

Best, Adrian



Logstash & Elasticsearch - Inserting / Updating data

2014-01-31 Thread Adrian Moreno
I'm testing out Logstash and ElasticSearch on my local dev (Win 7) as a 
replacement for our current SQL Server based search pages.

I'm using the current Logstash config to import a folder full of CSV files 
(pipe delimited) into ElasticSearch:

---
input {
  stdin {
type => "stdin-type"
  }

  file {
path => ["C:/Users/.../export.csv"]
  }
}

filter {
  csv {
columns => 
["property_id","postal_code","status_id","address_1","city","state"]
separator => "|"
  }
}

output {
  elasticsearch {
embedded => true
index => "assets"
index_type => "asset"
  }
}
---

1. Sometimes it imports, sometimes it doesn't. I've deleted the .sincedb 
files over and over and have changed the index name to make sure it's going 
in correctly (when it actually runs the import). Any idea why it's sporadic?

2. I have a data set of over a million records. The "_id" value of each 
record in ES is, of course, a unique string. If I add a new CSV file with 
updates for 100 records, how does Logstash or ES know how to match an 
update to an existing record? In the original data set, the "property_id" 
value is the primary key.

I looked 
at http://logstash.net/docs/1.3.3/outputs/elasticsearch#document_id , which 
seems to be the correct setting for the import, but what value? I tried 
"property_id", the first column name, but that doesn't work. The import 
doesn't even run with that setting.

Any help would be appreciated. Thanks.
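For question 2, the `document_id` option accepts an event field reference, so something like `document_id => "%{property_id}"` should route each row to a stable `_id` (I have not verified this against Logstash 1.3.3 specifically — the csv filter must also actually be populating `property_id` on the event for the reference to resolve). With a stable `_id`, indexing becomes idempotent: re-sending a record with the same id overwrites it instead of creating a duplicate. A toy model of that semantics:

```python
# Toy model of indexing with an explicit _id: the same id overwrites in
# place, a new id creates a new document. Mapping document_id to the CSV's
# primary key (property_id) therefore makes re-imports behave as updates.
index = {}

def index_doc(doc, id_field="property_id"):
    index[doc[id_field]] = doc  # same _id => overwrite, not duplicate

index_doc({"property_id": "100", "city": "Austin"})
index_doc({"property_id": "101", "city": "Boston"})
index_doc({"property_id": "100", "city": "Dallas"})  # a row from an update file
```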



Re: Is there a way to return only one result in a query??

2013-12-20 Thread Adrian Luna
I want to have several processes reindexing documents with updated 
information, as long as a document isn't already updated or being updated 
by another process. So with the query I'll check a flag and update its 
state to "processing" (an atomic operation). The next process will then 
come along and select a still unprocessed document, since the flag has been 
updated.

I guess if the plugin allows to do this for the complete set of matched 
documents, it should be possible to do it for just 1 doc.

Thanks in advance.

El viernes, 20 de diciembre de 2013 10:26:36 UTC+1, David Pilato escribió:
>
> Why do you want to do this? For test purpose?
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet <https://twitter.com/dadoonet> | 
> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>
>
> On December 20, 2013 at 09:49:09, Adrian Luna 
> (adrian.l...@gmail.com) 
> wrote:
>
> Any clue? 
>
> What I actually want is just a "update () set () where ()  limit 1" in sql 
> syntax.
> If possible, also with an aditional  "order by (column) desc".
>
> Is that possible at all in ES???
>
> Thanks!
>
> On Thursday, December 19, 2013 at 15:52:21 UTC+1, Adrian Luna wrote: 
>>
>> I am using the update by query 
>> plugin<https://github.com/yakaz/elasticsearch-action-updatebyquery>  
>> . I only want to update one of the elements that match my query at a 
>> time and the "size=1" is not working. Is there any other possibility to set 
>> the number of returned elements to 1. 
>>
>> Thanks for the info. 
>>  



Re: Is there a way to return only one result in a query??

2013-12-20 Thread Adrian Luna
Any clue?

What I actually want is just an "update () set () where () limit 1" in SQL 
syntax.
If possible, also with an additional "order by (column) desc".

Is that possible at all in ES???

Thanks!

On Thursday, December 19, 2013 at 15:52:21 UTC+1, Adrian Luna wrote:
>
> I am using the update by query 
> plugin<https://github.com/yakaz/elasticsearch-action-updatebyquery>  
> . I only want to update one of the elements that match my query at a 
> time and the "size=1" is not working. Is there any other possibility to set 
> the number of returned elements to 1.
>
> Thanks for the info. 
>



Is there a way to return only one result in a query??

2013-12-19 Thread Adrian Luna
I am using the update by query plugin 
(https://github.com/yakaz/elasticsearch-action-updatebyquery). I only want 
to update one of the elements that match my query at a time, and "size=1" 
is not working. Is there any other way to limit the number of returned 
elements to 1?

Thanks for the info. 
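Update-by-query operates on everything the query matches, so limiting it to one document needs a client-side two-step: search with `size: 1` (optionally sorted), then update that single document guarded by its returned `_version`, so a concurrent worker claiming the same document gets a version conflict instead of causing a double update. A sketch of the two request bodies (the field names `processed` and `priority` are made up):

```python
# Step 1: fetch exactly one candidate, the SQL equivalent of
# "... where processed = false order by priority desc limit 1".
search_body = {
    "size": 1,
    "query": {"term": {"processed": False}},
    "sort": [{"priority": {"order": "desc"}}],
}

# Step 2: update that one document. Sending the hit's _version along with
# the update (e.g. as a ?version= parameter) makes the claim optimistic:
# if another worker updated the doc first, ES rejects with a version
# conflict and this worker retries with the next candidate.
def claim_body():
    return {"doc": {"processed": True}}

update_body = claim_body()
```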
