date:20140608

Elasticsearch stemmer issue

2014-06-08 Thread Александр Шаманов

Hello everyone,

I have the following index mapping:


curl -XPUT 'http://localhost:9200/some_content/' -d '
{
  "settings": {
    "query_string": {
      "default_con": "content",
      "default_operator": "AND"
    },
    "index": {
      "analysis": {
        "analyzer": {
          "en_analyser": {
            "filter": [
              "snowBallFilter"
            ],
            "type": "custom",
            "tokenizer": "standard"
          }
        },
        "filter": {
          "en_stopFilter": {
            "type": "stop",
            "stopwords_path": "lang/stopwords_en.txt"
          },
          "snowBallFilter": {
            "type": "snowball",
            "language": "English"
          },
          "wordDelimiterFilter": {
            "catenate_all": false,
            "catenate_words": true,
            "catenate_numbers": true,
            "generate_word_parts": true,
            "generate_number_parts": true,
            "preserve_original": true,
            "type": "word_delimiter",
            "split_on_case_change": true
          },
          "en_synonymFilter": {
            "synonyms_path": "lang/synonyms_en.txt",
            "ignore_case": true,
            "type": "synonym",
            "expand": false
          },
          "lengthFilter": {
            "max": 250,
            "type": "length",
            "min": 3
          }
        }
      }
    }
  },
  "mappings": {
    "docs": {
      "_source": {
        "enabled": false
      },
      "analyzer": "en_analyser",
      "properties": {
        "content": {
          "type": "string",
          "index": "analyzed",
          "term_vector": "with_positions_offsets",
          "omit_norms": "true"
        }
      }
    }
  }
}'

and I indexed the following document:

curl -XPOST http://localhost:9200/some_content/docs/ -d '
{  
  "content" : "Some sampling text formatted for text data" 
}'

When I make this request:
http://epbyvitw0052:9200/some_content/docs/_search?q=sampling

I get this result:

{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.095891505,
"hits": [
{
"_index": "some_content",
"_type": "docs",
"_id": "saLfx6PYR82YR69je0JbAA",
"_score": 0.095891505
}
]
}
}
 

but when I send the request without the type:
http://epbyvitw0052:9200/some_content/_search?q=sampling

I get nothing:

{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}


However, when I search for the stemmed term:
http://epbyvitw0052:9200/some_content/_search?q=sampl

the system finds the document:

{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.095891505,
"hits": [
{
"_index": "some_content",
"_type": "docs",
"_id": "saLfx6PYR82YR69je0JbAA",
"_score": 0.095891505
}
]
}
}
 

This issue appears when I add a stemmer to the analyzer.
Could you explain why the system behaves this way?
Maybe I am doing something wrong.
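
One possible explanation, offered as a sketch rather than a confirmed diagnosis: a bare q=sampling searches the _all field, and when no type is given in the URL the query text may be analyzed with the index default analyzer instead of the type-level en_analyser. The index holds the stem (sampl), so an unstemmed query term finds nothing while sampl matches. The Python below imitates that mismatch; toy_stem is a hypothetical stand-in and not the Snowball algorithm.

```python
def toy_stem(token):
    """Crude stand-in for a stemmer: strip one common English suffix."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text, stem):
    """Lowercase, split on whitespace, and optionally stem each token."""
    tokens = text.lower().split()
    return [toy_stem(t) if stem else t for t in tokens]


# Index time: the document goes through the stemming analyzer.
indexed_terms = set(analyze("Some sampling text formatted for text data", stem=True))


def matches(query, stem_query):
    """A query matches only if every analyzed query term is in the index."""
    return all(term in indexed_terms for term in analyze(query, stem=stem_query))


print(matches("sampling", stem_query=True))   # query analyzed like the document
print(matches("sampling", stem_query=False))  # query analyzed without stemming
print(matches("sampl", stem_query=False))     # searching for the stem directly
```

If this is indeed the cause, registering the analyzer under the name default in the index analysis settings, or querying an explicit field, should make both URLs behave the same way.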

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e311d613-4411-4b70-b800-05f6be9ad5cb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

restore snapshot not working

2014-06-08 Thread Hermanto Phang

Hi All,

I tried to restore my backup snapshot using the command
curl -XPOST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore"

but it returned this error:

{"error":"SnapshotRestoreException[[my_backup:snapshot_1] cannot restore 
index [cobbler_api] because it's open]","status":500}


Any suggestions on how to fix this error?
Do I need to close the index before restoring, and if so, how can I do that?
Thanks in advance for your input.
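
The error message says the index is open, so the usual sequence is to close it with the close index API and then repeat the restore. As far as I know, the restore reopens the index when it finishes. The Python below only builds the two REST calls in order; restore_steps is a hypothetical helper and nothing is sent to the cluster.

```python
def restore_steps(repository, snapshot, index):
    """Return the REST calls to issue, in order, as (method, path) tuples."""
    return [
        # 1. Close the existing index so the restore may overwrite it.
        ("POST", "/{0}/_close".format(index)),
        # 2. Run the restore that previously failed.
        ("POST", "/_snapshot/{0}/{1}/_restore".format(repository, snapshot)),
    ]


# Print the equivalent curl commands for the names used in this thread.
for method, path in restore_steps("my_backup", "snapshot_1", "cobbler_api"):
    print("curl -X{0} 'localhost:9200{1}'".format(method, path))
```

Closing blocks reads and writes on cobbler_api until the restore completes, so this is best done in a maintenance window.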


Re-creating ES Index

2014-06-08 Thread karthik jayanthi

Hi, 

I have a few questions about needing to re-create an index.

1) Is there any way to re-create an index other than deleting the
current one and creating it again with the documents?
2) When re-creating, can we use the data already stored
in the ES cluster?
3) Is there an easy way of doing this without any downtime
for the running ES cluster?

I'd appreciate any response.



Thanks, 
Karthik
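
A common pattern for question 3 is to have clients talk to an alias, build the new index next to the old one (copying documents with scan/scroll plus bulk, which requires _source to be stored), and then move the alias in a single call to the _aliases endpoint. The sketch below only builds that request body; the alias and index names are hypothetical.

```python
import json


def alias_swap_body(alias, old_index, new_index):
    """Body for POST /_aliases that moves an alias in one atomic request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: clients query "content"; the data moves from v1 to v2.
body = alias_swap_body("content", "content_v1", "content_v2")
print(json.dumps(body, indent=2))
```

Because both actions travel in one request, searches against the alias should never see a moment with no index behind it. The old index can be deleted once the swap is verified.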


Re: Queries, filters and match_all

2014-06-08 Thread Arkadiy Zabazhanov

Guys, I still need help. I've tried changing the filtered query strategies. It 
returns all the filtered results anyway for versions 1.0.0 - 1.2.1. When 
was this behavior changed, and how? Why don't I need match_all for a filtered 
query with an empty query?
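
For reference, the two request shapes discussed in this thread can be sketched as below. My understanding of the 1.x documentation is that a filtered query with no query part falls back to match_all, which would explain the behavior, but treat that as an assumption. The status field is hypothetical.

```python
import json


def filtered_query(filter_clause, query_clause=None):
    """Filtered query; the inner query is left out when none is given."""
    filtered = {"filter": filter_clause}
    if query_clause is not None:
        filtered["query"] = query_clause
    return {"query": {"filtered": filtered}}


def post_filtered_query(query_clause, filter_clause):
    """Top-level query plus post_filter (the renamed top-level filter)."""
    return {"query": query_clause, "post_filter": filter_clause}


term_filter = {"term": {"status": "published"}}  # hypothetical field and value

print(json.dumps(filtered_query(term_filter)))
print(json.dumps(post_filtered_query({"match_all": {}}, term_filter)))
```

The first body narrows the document set before facets are computed; the second filters only the returned hits, which matches the facet difference described in the original question.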

On Friday, June 6, 2014, 7:14:28 UTC+7, Arkadiy Zabazhanov wrote:
>
> Yeah, I've got this already, thanks.
>
> I'm still confused why the filtered query returns all results even 
> without match_all in the filtered query.
>
> On Thursday, June 5, 2014, 6:21:03 UTC+7, Ivan Brusic wrote:
>>
>> There is no label, but the change was made last December:
>>
>> https://github.com/elasticsearch/elasticsearch/pull/4461
>>
>> It appears that the REST API still supports the old notation, but the 
>> change did break Java backwards compatibility
>>
>>
>> https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/search/query/QueryPhase.java#L71
>>
>> -- 
>> Ivan
>>
>>
>>
>> On Tue, Jun 3, 2014 at 8:11 PM, Arkadiy Zabazhanov  
>> wrote:
>>
>>> By the way, the answer to the second question is that the top-level 
>>> filter was renamed to post_filter. That's awesome. So the first 
>>> question is answered too: the filtered query is preferred.
>>> I'm still waiting for an answer to the third question, since I didn't 
>>> find the filter to post_filter renaming in the changelog (
>>> http://www.elasticsearch.org/downloads/1-0-0/) and I can't find 
>>> anything about the new query behavior. I just need the version where 
>>> it was changed, please.
>>>
>>> On Tuesday, June 3, 2014, 19:27:17 UTC+7, Arkadiy Zabazhanov wrote:
>>>
>>>> Hello. Please help me, I'm confused. As far as I remember, the only 
>>>> way to pass filters to a search query used to be via a filtered query. 
>>>> But currently there is also a top-level filter part of the request. 
>>>> However, the top-level filter affects only the query results and does 
>>>> not affect, e.g., facets, whereas the filter in a filtered query 
>>>> affects both the query and the facets. 
>>>> Also, I remember there was a time when I needed to add a match_all 
>>>> query to the filtered query section if the query was empty and only 
>>>> filters were present; otherwise an empty set of documents was 
>>>> returned. Since I'm trying to create a high-level Ruby library, could 
>>>> you please answer the following questions:
>>>>
>>>> 1) Which way is preferred now and in the future: a top-level filtered 
>>>> query, or a top-level filter with a top-level query?
>>>> 2) How do you plan to resolve the API inconsistency where the filter 
>>>> in a filtered query affects outside statements but the top-level 
>>>> filter doesn't affect some parts of the request?
>>>> 3) Why do I remember the match_all requirement, and when did requests 
>>>> start returning all documents for a filtered query with an empty 
>>>> query section? I'm checking it right now on 1.2.0 and I don't need to 
>>>> use match_all or constant_score; it just returns all the docs for me.
>>>>
>>>> Thanks in advance.


Losing data after Elasticsearch restart

2014-06-08 Thread Rohit Jaiswal

Hello everyone,

We lost data after restarting our Elasticsearch cluster. Restarting is
part of deploying our software stack.

We have a 20-node cluster running 0.90.2, and we have Splunk configured
to index the ES logs.

Looking at the Splunk logs, we found the following *error a day before
the deployment* (restart):

[cluster.action.shard ] [Rictor] sending failed shard for 
[c0a71ddaa70b463a9a179c36c7fc26e3][2], node[nJvnclczRNaLbETunjlcWw], [R], 
s[STARTED], reason
[Failed to perform [bulk/shard] on replica, message 
[RemoteTransportException; nested: ResponseHandlerFailureTransportException; 
nested: NullPointerException; ]]

[cluster.action.shard ] [Kiss] received shard failed for 
[c0a71ddaa70b463a9a179c36c7fc26e3][2], node[nJvnclczRNaLbETunjlcWw], [R], 
s[STARTED], reason 
[Failed to perform [bulk/shard] on replica, message 
[RemoteTransportException; nested: ResponseHandlerFailureTransportException; 
nested: NullPointerException; ]]  



Further, *a day after the deploy*, we see the same errors on another
node:



[cluster.action.shard ] [Contrary] received shard failed 
for [a58f9413315048ecb0abea48f5f6aae7][1], node[3UbHwVCkQvO3XroIl-awPw], [R], 
s[STARTED], reason
[Failed to perform [bulk/shard] on replica, message 
[RemoteTransportException; nested: ResponseHandlerFailureTransportException; 
nested: NullPointerException; ]]

   
*Immediately afterwards, the following error is seen.* It also appears
repeatedly on a couple of other nodes:
 
 failed to start shard
 
 [cluster.action.shard ] [Copperhead] sending failed shard 
for [a58f9413315048ecb0abea48f5f6aae7][0], node[EuRzr3MLQiSS6lzTZJbiKw], [R], 
s[INITIALIZING],
 reason [Failed to start shard, message 
[RecoveryFailedException[[a58f9413315048ecb0abea48f5f6aae7][0]: Recovery failed 
from [Frank Castle][dlv2mPypQaOxLPQhHQ67Fw]
 [inet[/10.2.136.81:9300]] into 
[Copperhead][EuRzr3MLQiSS6lzTZJbiKw][inet[/10.3.207.55:9300]]]; nested: 
RemoteTransportException[[Frank Castle]
 
[inet[/10.2.136.81:9300]][index/shard/recovery/startRecovery]]; nested: 
RecoveryEngineException[[a58f9413315048ecb0abea48f5f6aae7][0] Phase[2] 
Execution failed]; 
 nested: 
RemoteTransportException[[Copperhead][inet[/10.3.207.55:9300]][index/shard/recovery/translogOps]];
 nested: InvalidAliasNameException[[a58f9413315048ecb0abea48f5f6aae7]

* Invalid alias name 
[fbf1e55418a2327d308e7632911f9bb8bfed58059dd7f1e4abd3467c5f8519c3], Unknown 
alias name was passed to alias Filter]; ]]*

 
*During this time, we could not access previously indexed documents.*
I looked up the alias error; it looks like it is related to
https://github.com/elasticsearch/elasticsearch/issues/1198 (Delete By Query
wrongly persisted to translog #1198), but that should have been fixed in
ES 0.18.0, and we are using 0.90.2, so why is ES encountering this issue?

What do we need to do to set this right and get the lost data back?
Please help.

Thanks.

 




Sort nested documents in search result

2014-06-08 Thread Zdenek Pizl

Hello,

Let's say I have the following document structure with multiple nested
documents:

{
  "host": "server-001",
  "plugins": [
    {
      "plugin_name": "function-c",
      "plugin": {"function-c": "some C content" }
    },
    {
      "plugin_name": "function-a",
      "plugin": {"function-a": "some A content" }
    },
    {
      "plugin_name": "function-b",
      "plugin": {"function-b": "some B content" }
    }
  ]
}


I would like the search result to return the nested plugin objects sorted, 
that is, JSON whose nested documents are ordered by plugin_name as 
'function-a, function-b, function-c'. Is this even possible? If it is, what 
should the query and/or mapping look like?

I've tried, for example, '{ "query": { "match": { "host": "server-001" }}, 
"sort": { "plugins.plugin_name" : "asc" }}' but it does not sort the nested 
block at all.

Thank you, regards Z. Pizl.
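
As far as I understand it, "sort" orders the hits (whole documents) and _source comes back exactly as it was indexed, so the order inside the plugins array is not changed by the query. One workaround, sketched here under that assumption, is to sort the array on the client after fetching the hit (or before indexing). sort_plugins is a hypothetical helper.

```python
import copy


def sort_plugins(source):
    """Return a copy of a hit's _source with 'plugins' ordered by plugin_name."""
    result = copy.deepcopy(source)
    result["plugins"] = sorted(result["plugins"], key=lambda p: p["plugin_name"])
    return result


# The document from this message, as it would come back in _source.
doc = {
    "host": "server-001",
    "plugins": [
        {"plugin_name": "function-c", "plugin": {"function-c": "some C content"}},
        {"plugin_name": "function-a", "plugin": {"function-a": "some A content"}},
        {"plugin_name": "function-b", "plugin": {"function-b": "some B content"}},
    ],
}

sorted_doc = sort_plugins(doc)
print([p["plugin_name"] for p in sorted_doc["plugins"]])
```

The deep copy keeps the original hit untouched, which helps if the same response is reused elsewhere.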


Re: compresstion in ES 1.2.1

2014-06-08 Thread sri

Hello Jörg,

Thanks a lot for the info. I tried applying the template you provided, 
but the size is not decreasing. On the other hand, I did notice a decrease 
in size when I disabled the fields via the Mapping API.

Thanks and Regards
Sri


Re: compresstion in ES 1.2.1

2014-06-08 Thread joergpra...@gmail.com

Try this index template for new index creations

curl -XPUT 'localhost:9200/_template/template1' -d '
{
  "template" : "*",
  "mappings" : {
    "_default_" : {
      "_source" : { "enabled" : false },
      "_all" : { "enabled" : false }
    }
  }
}
'

See also

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html

You cannot disable _all or _source in an existing index.

Jörg




Re: compresstion in ES 1.2.1

2014-06-08 Thread sri

Thanks a lot for the insight, Patrick. 

I have a few more questions:

   - Is it possible to disable the '_source' and '_all' fields by default 
   for all indices created later (possibly by defining this in the 
   elasticsearch.yml file)? 
   - What happens if my index is already created and I then disable the 
   '_source' and '_all' fields? Would that affect the file size of the 
   index, i.e., will the fields be removed/disabled only for documents 
   added after the fields are disabled?

Thanks and Regards
Sri


Re: compresstion in ES 1.2.1

2014-06-08 Thread joergpra...@gmail.com

Lucene uses LZ4 compression

http://blog.jpountz.net/post/35667727458/stored-fields-compression-in-lucene-4-1

so you should not run ES on a ZFS file system with compression enabled.

Jörg




Find all the geoshapes that insersects with a given latitude/longitude

2014-06-08 Thread Vidal Chriqui

Hi

My goal is to find, for a given latitude/longitude, all the indexed documents 
(circular geoshapes, each with its own radius) that contain this 
lat/lon.
If necessary, I'm OK with converting the circular zones to envelopes, but this 
does not seem to be the issue.

I need help writing the search query. 
In this gist you can find the document mappings, 3 sample docs, and 2 
attempts at the query. Those queries return nothing when they should 
return results, so obviously they are not correct.

https://gist.github.com/anonymous/3e6aa70bf8b31e8eb345

Thanks for your help writing the correct query.

Best regards
Vidal
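
One approach that may work, sketched without having seen the mappings in the gist: use a geo_shape filter whose query shape is a GeoJSON point, so that indexed shapes intersecting that point are returned. The field name zone and the sample coordinates are hypothetical, and the field has to be mapped as geo_shape. Note that GeoJSON coordinates are ordered longitude first, then latitude.

```python
import json


def shapes_containing_point(field, lat, lon):
    """Build a search body matching indexed shapes that cover one point."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "geo_shape": {
                        field: {
                            # GeoJSON order is [longitude, latitude].
                            "shape": {"type": "point", "coordinates": [lon, lat]}
                        }
                    }
                },
            }
        }
    }


# Hypothetical field name and coordinates, for illustration only.
body = shapes_containing_point("zone", 48.8566, 2.3522)
print(json.dumps(body, indent=2))
```

A frequent cause of empty results with this kind of query is swapped latitude and longitude, so that is worth checking first in the sample documents.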


Re: compresstion in ES 1.2.1

2014-06-08 Thread Patrick Proniewski

Hello,

I don't know how it's compressed, but it appears that data is compressed in 
blocks of up to 4k, i.e., it's useless to store data on a compressed (lz4) 
filesystem if the fs block size is 4k:

Filesystem      Size   Used  Avail  Capacity  Mounted on
zdata/ES-lz4    1.1T   1.9G   1.1T        0%  /zdata/ES-lz4
zdata/ES        1.1T   1.9G   1.1T        0%  /zdata/ES

But if the fs block size is greater (say 128k), filesystem compression is a 
huge win:

Filesystem      Size   Used  Avail  Capacity  Mounted on
zdata/ES-lz4    1.1T   1.1G   1.1T        0%  /zdata/ES-lz4   -> compressratio 1.73x
zdata/ES-gzip   1.1T   901M   1.1T        0%  /zdata/ES-gzip  -> compressratio 2.27x
zdata/ES        1.1T   1.9G   1.1T        0%  /zdata/ES

Unfortunately, a filesystem block size greater than 4K is not optimal for IO 
(unless you have a large amount of physical memory you can dedicate to the 
filesystem data cache, which would be redundant with the ES cache).



On 08 June 2014, at 18:41, David Pilato wrote:

> It's compressed by default now.
> 
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
> 
> 
> On June 8, 2014, at 18:01, sri <1.fr@gmail.com> wrote:
> 
> Hello everyone, 
> 
> I have read posts and blogs on how Elasticsearch compression can be enabled 
> in previous versions (0.17 - 0.19). 
> 
> I am currently using ES 1.2.1, and I wasn't able to find out how to enable 
> compression in this version, or whether there is any such option at all.
> 
> I know that I can reduce the storage amount by disabling the source using the 
> mapping API, but what I am interested in is compression of the stored data.
> 
> Thanks and Regards
> Sri


Re: Nested object type and join:false and geo_shape

2014-06-08 Thread bants

Hey - no, sorry I didn't ever get a response.

Picking up this work over the next couple of times so I'll see if I come up 
with anything.

On Thursday, May 29, 2014 2:16:07 PM UTC+1, horse.bad...@gmail.com wrote:
>
> Hello,
>
> I am trying to achieve something very similar, where a nested filter is 
> applied on a nested document and I only want the sub doc and not the root 
> in the hit source.  Setting the filter's 'join' to false always returns 0 
> hits.  Did you find a resolution?
>
> Thank you!
>
> On Monday, February 10, 2014 3:09:48 PM UTC-5, bants wrote:
>>
>> Hi All, 
>>
>> I would like to be able to search against documents using a geo_shape 
>> filter where the geojson is a nested subdocument and only retrieve the 1 
>> sub document that matched the geographical filter (not the whole document). 
>> I think the docs (specifically nested object type and nested query/filter 
>> docs) state this is possible using join:false. For some reason I can't get 
>> it to work though and I'm convinced its a user error or lack of 
>> understanding. 
>>
>> I'm on ES 0.90.5, and below is a worked example.
>>
>> Can someone point me in the right direction please?
>>
>> Thanks
>>
>> # Clear the deck and create a new index
>>
>> > curl -XDELETE http://localhost:9200/test
>>
>> {"ok":true,"acknowledged":true}
>>
>>
>> > curl -XPUT  http://localhost:9200/test
>>
>> {"ok":true,"acknowledged":true}
>>
>>
>> # Set a new mapping for the testtype
>>
>> > curl -XPUT http://localhost:9200/test/testtype/_mapping -d '{"testtype": 
>> {"properties": {"entities": {"type": "nested", "properties": {"geometry": 
>> {"tree": "quadtree", "type": "geo_shape","precision": "10m"}}}}}}'
>> {"ok":true,"acknowledged":true}
>>
>> # Index a new document
>>
>> curl -XPUT http://localhost:9200/test/testtype/doc1 -d '{"id" : "doc1", 
>> "entities": [{"geometry": {"type" : "Point", "coordinates": [0.0, 0.0]}}, 
>> {"geometry": {"type" : "Point", "coordinates": [180.0, 90.0]}}]}'
>>
>> # Query WITH join:false
>>
>> > curl -XGET http://localhost:9200/test/testtype/_search -d '{"query": 
>> {"filtered": {"filter": {"nested" : {"path" : "entities", "join":false, 
>> "filter" : {"geo_shape": {"entities.geometry": {"shape": {"type": 
>> "envelope","coordinates": [[-10.0, 10.0],[10.0, -10.0]]}}}}}},"query": 
>> {"match_all": {}}}}}'
>>
>>
>> {"took":0,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
>>
>> # Query WITHOUT join:false
>>
>> > curl -XGET http://localhost:9200/test/testtype/_search -d '{"query": 
>> {"filtered": {"filter": {"nested" : {"path" : "entities", "filter" : 
>> {"geo_shape": {"entities.geometry": {"shape": {"type": 
>> "envelope","coordinates": [[-10.0, 10.0],[10.0, -10.0]]}}}}}},"query": 
>> {"match_all": {}}}}}'
>>
>> {"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"test","_type":"testtype","_id":"doc1","_score":1.0,
>>  
>> "_source" : {"id" : "doc1", "entities": [{"geometry": {"type" : "Point", 
>> "coordinates": [0.0, 0.0]}}, {"geometry": {"type" : "Point", "coordinates": 
>> [180.0, 90.0]}}]}}]}}
>>
>>
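Hand-written one-line JSON like the requests above is easy to leave with unbalanced braces, so one option is to build the body programmatically and serialize it. The Python sketch below is only an illustration: the helper name `nested_geo_query` is hypothetical, the field names are taken from the example above, and the `join` option is simply passed through as discussed in this thread (its behaviour is not verified here).

```python
import json


def nested_geo_query(join=None):
    """Build the filtered/nested/geo_shape request body used in the example.

    The "join" key is only added to the nested filter when join is not None.
    """
    nested = {
        "path": "entities",
        "filter": {
            "geo_shape": {
                "entities.geometry": {
                    "shape": {
                        "type": "envelope",
                        "coordinates": [[-10.0, 10.0], [10.0, -10.0]],
                    }
                }
            }
        },
    }
    if join is not None:
        nested["join"] = join
    return {
        "query": {
            "filtered": {
                "filter": {"nested": nested},
                "query": {"match_all": {}},
            }
        }
    }


if __name__ == "__main__":
    # Serialized body, suitable for passing to curl via -d.
    print(json.dumps(nested_geo_query(join=False)))
```

Because `json.dumps` produces the string, the braces are balanced by construction, which removes one possible source of confusion when comparing the two queries.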


options for accessing ES repository from traditional BI tools that do not support REST API

2014-06-08 Thread elitem way

Since ES now supports aggregation queries, I am thinking of using ES as a data 
warehouse staging area. The challenge is pulling the summary data out of ES 
with a traditional BI tool like Tableau. I know the Hive ODBC driver with 
ES-Hadoop Hive is an option, but it is very slow compared with the native REST 
API. Are there other options? Does ES plan to support a SQL query interface 
natively? Thanks.
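One possible workaround, short of a SQL interface, is a small script that runs the aggregation over REST and flattens the buckets into rows a BI tool can import as a flat file. The Python sketch below assumes a terms aggregation with a sum sub-aggregation; the aggregation names (`by_make`, `total_price`), the field names, and the sample response are made up for illustration, and the HTTP call itself is left out.

```python
import csv
import io

# Hypothetical request body: total price per make (field names are assumptions).
AGG_REQUEST = {
    "size": 0,
    "aggs": {
        "by_make": {
            "terms": {"field": "make"},
            "aggs": {"total_price": {"sum": {"field": "price"}}},
        }
    },
}

# Made-up response in the shape of a terms aggregation with a sum sub-aggregation.
SAMPLE_RESPONSE = {
    "aggregations": {
        "by_make": {
            "buckets": [
                {"key": "ford", "doc_count": 2, "total_price": {"value": 55000.0}},
                {"key": "toyota", "doc_count": 2, "total_price": {"value": 27000.0}},
            ]
        }
    }
}


def buckets_to_rows(response, agg_name, metric_name):
    """Flatten one level of terms buckets into (key, doc_count, metric) rows."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["doc_count"], b[metric_name]["value"]) for b in buckets]


def rows_to_csv(rows, header):
    """Render rows as CSV text that can be saved and loaded as a flat file."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    rows = buckets_to_rows(SAMPLE_RESPONSE, "by_make", "total_price")
    print(rows_to_csv(rows, ["make", "doc_count", "total_price"]))
```

This keeps the aggregation work inside ES and hands the BI tool only the summary rows, at the cost of maintaining one small export script per report.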


Re: issue of elasticsearch-hadoop-2.0.0 with Hive (cloudera and hortonworks), helps are needed

2014-06-08 Thread elitem way

Here is the Hive log when running the "select count(*) from cars2;":


application_1402243729361_0009
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:19 INFO parse.ParseDriver: Parsing command: select count(*) 
from cars2
14/06/08 10:27:19 INFO parse.ParseDriver: Parse Completed
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:19 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
14/06/08 10:27:19 INFO parse.SemanticAnalyzer: Completed phase 1 of 
Semantic Analysis
14/06/08 10:27:19 INFO parse.SemanticAnalyzer: Get metadata for source 
tables
14/06/08 10:27:19 INFO parse.SemanticAnalyzer: Get metadata for subqueries
14/06/08 10:27:19 INFO parse.SemanticAnalyzer: Get metadata for destination 
tables
14/06/08 10:27:19 INFO ql.Context: New scratch dir is 
hdfs://localhost.localdomain:8020/tmp/hive-hive/hive_2014-06-08_10-27-19_935_2002159029513445063-1
14/06/08 10:27:19 INFO parse.SemanticAnalyzer: Completed getting MetaData 
in Semantic Analysis
14/06/08 10:27:19 INFO ppd.OpProcFactory: Processing for FS(18)
14/06/08 10:27:19 INFO ppd.OpProcFactory: Processing for SEL(17)
14/06/08 10:27:19 INFO ppd.OpProcFactory: Processing for GBY(16)
14/06/08 10:27:19 INFO ppd.OpProcFactory: Processing for RS(15)
14/06/08 10:27:19 INFO ppd.OpProcFactory: Processing for GBY(14)
14/06/08 10:27:19 INFO ppd.OpProcFactory: Processing for SEL(13)
14/06/08 10:27:19 INFO ppd.OpProcFactory: Processing for TS(12)
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:19 INFO physical.MetadataOnlyOptimizer: Looking for table 
scans where optimization is applicable
14/06/08 10:27:19 INFO physical.MetadataOnlyOptimizer: Found 0 metadata 
only table scans
14/06/08 10:27:19 INFO parse.SemanticAnalyzer: Completed plan generation
14/06/08 10:27:19 INFO ql.Driver: Semantic Analysis Completed
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:19 INFO exec.ListSinkOperator: Initializing Self 19 OP
14/06/08 10:27:19 INFO exec.ListSinkOperator: Operator 19 OP initialized
14/06/08 10:27:19 INFO exec.ListSinkOperator: Initialization Done 19 OP
14/06/08 10:27:19 INFO ql.Driver: Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], 
properties:null)
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:19 INFO Configuration.deprecation: 
mapred.input.dir.recursive is deprecated. Instead, use 
mapreduce.input.fileinputformat.input.dir.recursive
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:19 INFO ql.Driver: Creating lock manager of type 
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
14/06/08 10:27:19 INFO zookeeper.ZooKeeper: Initiating client connection, 
connectString=localhost.localdomain:2181 sessionTimeout=60 
watcher=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager$DummyWatcher@d699a84
14/06/08 10:27:19 INFO log.PerfLogger: 
14/06/08 10:27:20 INFO log.PerfLogger: 
14/06/08 10:27:20 INFO log.PerfLogger: 
14/06/08 10:27:20 INFO ql.Driver: Starting command: select count(*) from 
cars2
14/06/08 10:27:20 INFO ql.Driver: Total MapReduce jobs = 1
14/06/08 10:27:20 INFO log.PerfLogger: 
14/06/08 10:27:20 INFO log.PerfLogger: 
14/06/08 10:27:20 INFO log.PerfLogger: 
14/06/08 10:27:20 INFO ql.Driver: Launching Job 1 out of 1
14/06/08 10:27:20 INFO exec.Task: Number of reduce tasks determined at 
compile time: 1
14/06/08 10:27:20 INFO exec.Task: In order to change the average load for a 
reducer (in bytes):
14/06/08 10:27:20 INFO exec.Task:   set 
hive.exec.reducers.bytes.per.reducer=
14/06/08 10:27:20 INFO exec.Task: In order to limit the maximum number of 
reducers:
14/06/08 10:27:20 INFO exec.Task:   set hive.exec.reducers.max=
14/06/08 10:27:20 INFO exec.Task: In order to set a constant number of 
reducers:
14/06/08 10:27:20 INFO exec.Task:   set mapred.reduce.tasks=
14/06/08 10:27:20 INFO ql.Context: New scratch dir is 
hdfs://localhost.localdomain:8020/tmp/hive-hive/hive_2014-06-08_10-27-19_935_2002159029513445063-3
14/06/08 10:27:20 INFO Configuration.deprecation: 
mapred.reduce.tasks.speculative.execution is deprecated. Instead, use 
mapreduce.reduce.speculative
14/06/08 10:27:20 INFO mr.ExecDriver: Using 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
14/06/08 10:27:20 INFO mr.ExecDriver: adding libjars: 
file:///opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hive/lib/hive-hbase-handler-0.12.0-cdh5.0.0.jar,file:///opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hbase/lib/htrace-core-2.01.jar,file:///opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hbase/hbase-hadoop-compat.jar,file:///opt/cloudera/p

Re: compression in ES 1.2.1

2014-06-08 Thread joergpra...@gmail.com

The Elasticsearch file size covers much more than the compressed stored fields:
term vectors, norms, and so on are included as well. You would have to disable
the field attributes you do not want. Also note that Elasticsearch has replicas
enabled by default and that the segment count is not optimized automatically.

Jörg
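To make the suggestions in this thread concrete, here is a sketch of an index-creation body for a size test, written as a Python dict so that it can be dumped to JSON. The type name `logs`, the field name `message`, and the choice of zero replicas are assumptions for a test setup only, and the option names follow my reading of the 1.x mapping documentation, so please check them against your version.

```python
import json

# Hypothetical index body for a size test; "logs" and "message" are made-up names.
INDEX_BODY = {
    "settings": {
        # Every replica is another full copy on disk; 0 is for a test setup only.
        "index.number_of_replicas": 0,
    },
    "mappings": {
        "logs": {
            # Skip the catch-all field that indexes the text of all fields again.
            "_all": {"enabled": False},
            "properties": {
                "message": {
                    "type": "string",
                    "norms": {"enabled": False},  # no length-normalisation data
                    "term_vector": "no",          # no per-document term vectors
                }
            },
        }
    },
}

if __name__ == "__main__":
    # Print the body so it can be sent when creating the index.
    print(json.dumps(INDEX_BODY, indent=2))
```

Measuring the index size once with the defaults and once with a body like this should show which attribute accounts for most of the overhead.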


On Sun, Jun 8, 2014 at 7:09 PM, sri <1.fr@gmail.com> wrote:

> Okay i will make the changes and upload the new stats.
>
> I am just curious, could you explain how the results were making sense, i
> just want to get a proper idea of what ES is actually doing to the data.
>
> Thanks and Regards
> Sri

Re: compression in ES 1.2.1

2014-06-08 Thread sri

Okay, I will make the changes and upload the new stats.

I am just curious: could you explain why the results make sense? I 
just want to get a proper idea of what ES is actually doing to the data.

Thanks and Regards
Sri

On Sunday, June 8, 2014 12:56:55 PM UTC-4, David Pilato wrote:
>
> Well. Think that you index all field individualy, that you are storing 
> source (compressed) and that you are indexing _all field as well.
>
> So with defaults, this results make sense to me.
>
> Try disable _all field and see what gain you can get.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Re: compression in ES 1.2.1

2014-06-08 Thread joergpra...@gmail.com

Compression is always enabled by default.

Jörg


On Sun, Jun 8, 2014 at 6:01 PM, sri <1.fr@gmail.com> wrote:

> Hello everyone,
>
> I have read posts and blogs on how elasticsearch compression can be
> enabled in the previous versions(0.17 - 0.19).
>
> I am currently using ES 1.2.1, i wasn't able to find out how to enable
> compression in this version or if at all there is any such option for it.
>
> I know that i can reduce the storage amount by disabling the source using
> the mapping api, but what i was interested is the compression of data storage.
>
> Thanks and Regards
> Sri

issue of elasticsearch-hadoop-2.0.0 with Hive (cloudera and hortonworks), helps are needed

2014-06-08 Thread elitem way

I am learning elasticsearch-hadoop and have run into a few issues that I do not 
understand. I am using ES 1.12 on Windows, elasticsearch-hadoop-2.0.0, and the 
cloudera-quickstart-vm-5.0.0-0-vmware sandbox with Hive.

1. I loaded only 6 rows into the ES index cars/transactions. Why did Hive return 
14 rows instead? See below.
2. "select count(*) from cars2" failed with return code 2. "Group by" and "sum" 
also failed. Did I miss anything? Similar queries succeed against the 
sample_07 and sample_08 tables that come with Hive.
3. elasticsearch-hadoop-2.0.0 does not seem to work with jetty (the 
authentication plugin). I got errors when I enabled jetty and set 'es.nodes' 
= 'superuser:admin@192.168.128.1'.
4. I could not pipe data from Hive to Elasticsearch either.

*--ISSUE 1*:
--load data to ES
 POST: http://localhost:9200/cars/transactions/_bulk
{ "index": {}}
{ "price" : 3, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 2, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 8, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }

CREATE EXTERNAL TABLE cars2 (color STRING, make STRING, price BIGINT, sold TIMESTAMP)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'cars/transactions',
'es.nodes' = '192.168.128.1', 'es.port'='9200');

HIVE: select * from cars2;
14 rows returned.

     color  make    price  sold
 0   red    honda       2  2014-11-05 00:00:00.0
 1   red    honda       1  2014-10-28 00:00:00.0
 2   green  ford        3  2014-05-18 00:00:00.0
 3   green  toyota  12000  2014-08-19 00:00:00.0
 4   blue   ford    25000  2014-02-12 00:00:00.0
 5   blue   toyota  15000  2014-07-02 00:00:00.0
 6   red    bmw         8  2014-01-01 00:00:00.0
 7   red    honda       1  2014-10-28 00:00:00.0
 8   blue   toyota  15000  2014-07-02 00:00:00.0
 9   red    honda       2  2014-11-05 00:00:00.0
10   green  ford        3  2014-05-18 00:00:00.0
11   green  toyota  12000  2014-08-19 00:00:00.0
12   red    honda       2  2014-11-05 00:00:00.0
13   red    honda       2  2014-11-05 00:00:00.0
14   red    bmw         8  2014-01-01 00:00:00.0
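A quick way to see what is going on in a result like this is to count how often each row occurs. The Python sketch below copies the rows from the listing above (date part of `sold` only) and reports the repeated ones; the helper name `duplicate_report` is made up for illustration. Note that the listing is numbered 0 to 14 and contains a row (red honda, price 1, 2014-10-28) that is not among the six bulk lines shown, which may mean the index held more documents than those six.

```python
from collections import Counter

# Rows copied from the "select * from cars2" output: (color, make, price, sold date).
ROWS = [
    ("red", "honda", 2, "2014-11-05"),
    ("red", "honda", 1, "2014-10-28"),
    ("green", "ford", 3, "2014-05-18"),
    ("green", "toyota", 12000, "2014-08-19"),
    ("blue", "ford", 25000, "2014-02-12"),
    ("blue", "toyota", 15000, "2014-07-02"),
    ("red", "bmw", 8, "2014-01-01"),
    ("red", "honda", 1, "2014-10-28"),
    ("blue", "toyota", 15000, "2014-07-02"),
    ("red", "honda", 2, "2014-11-05"),
    ("green", "ford", 3, "2014-05-18"),
    ("green", "toyota", 12000, "2014-08-19"),
    ("red", "honda", 2, "2014-11-05"),
    ("red", "honda", 2, "2014-11-05"),
    ("red", "bmw", 8, "2014-01-01"),
]


def duplicate_report(rows):
    """Return a dict of row -> occurrence count for rows seen more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


if __name__ == "__main__":
    print("rows returned:", len(ROWS))
    print("distinct rows:", len(set(ROWS)))
    for row, n in sorted(duplicate_report(ROWS).items()):
        print(n, row)
```

Comparing the distinct rows with a document count taken directly from ES would show whether the extra rows come from the index itself or from the way Hive reads it.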


*ISSUE2:*

HIVE: select count(*) from cars2;

Your query has the following error(s):
Error while processing statement: FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.mr.MapRedTask


*--ISSUE 4:*

CREATE EXTERNAL TABLE test1 (
description STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.host' = '192.168.128.1', 'es.port'='9200', 'es.resource' = 'test1');

INSERT OVERWRITE TABLE test1 select description from sample_07;

Your query has the following error(s):

Error while processing statement: FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.mr.MapRedTask


Re: compression in ES 1.2.1

2014-06-08 Thread David Pilato

Well, consider that you index every field individually, that you store the source 
(compressed), and that you index the _all field as well.

So with the defaults, these results make sense to me.

Try disabling the _all field and see what you gain.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Le 8 juin 2014 à 18:50, sri <1.fr@gmail.com> a écrit :

Hi David, 

Thank you very much for the prompt reply.

Below are the stats that i got when i was testing the ES cluster:

Number of Nodes :2 
Input format : rsyslog

input file size(Mb) ES file size per node(Mb)
1   1.8
2   3.6
3   5.3
4   6.8
5   8.5
6   10.1
7   11.7
8   13
9   14.1
10  16

I am sorry to ask like this, but i wasn't understanding how the compression was 
taking place.

Thanks and Regards
Sri
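For reference, the figures in the table can be turned into an expansion ratio (ES size per node divided by input size). A small Python sketch, with the numbers copied from the table and the helper name `expansion_ratios` made up for illustration:

```python
# (input file size in Mb, ES file size per node in Mb), copied from the table.
MEASUREMENTS = [
    (1, 1.8), (2, 3.6), (3, 5.3), (4, 6.8), (5, 8.5),
    (6, 10.1), (7, 11.7), (8, 13.0), (9, 14.1), (10, 16.0),
]


def expansion_ratios(measurements):
    """Return ES-size / input-size for every measurement, rounded to 2 places."""
    return [round(es_size / raw_size, 2) for raw_size, es_size in measurements]


if __name__ == "__main__":
    for (raw_size, es_size), ratio in zip(MEASUREMENTS, expansion_ratios(MEASUREMENTS)):
        print(raw_size, es_size, ratio)
```

The ratio stays between roughly 1.6 and 1.8 for every row, so the overhead looks proportional to the input rather than a fixed cost.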


Re: compression in ES 1.2.1

2014-06-08 Thread sri

Hi David, 

Thank you very much for the prompt reply.

Below are the stats that i got when i was testing the ES cluster:

Number of Nodes :2 
Input format : rsyslog

input file size(Mb) ES file size per node(Mb)
1   1.8
2   3.6
3   5.3
4   6.8
5   8.5
6   10.1
7   11.7
8   13
9   14.1
10  16

I am sorry to ask like this, but i wasn't understanding how the compression 
was taking place.

Thanks and Regards
Sri

On Sunday, June 8, 2014 12:41:35 PM UTC-4, David Pilato wrote:
>
> It's compressed by default now.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Re: compression in ES 1.2.1

2014-06-08 Thread David Pilato

It's compressed by default now.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Le 8 juin 2014 à 18:01, sri <1.fr@gmail.com> a écrit :

Hello everyone, 

I have read posts and blogs on how elasticsearch compression can be enabled in 
the previous versions(0.17 - 0.19). 

I am currently using ES 1.2.1, i wasn't able to find out how to enable 
compression in this version or if at all there is any such option for it.

I know that i can reduce the storage amount by disabling the source using the 
mapping api, but what i was interested is the compression of data storage.

Thanks and Regards
Sri


compression in ES 1.2.1

2014-06-08 Thread sri

Hello everyone, 

I have read posts and blogs on how Elasticsearch compression could be enabled 
in previous versions (0.17 - 0.19). 

I am currently using ES 1.2.1, and I wasn't able to find out how to enable 
compression in this version, or whether there is any such option at all.

I know that I can reduce the storage used by disabling the source through 
the mapping API, but what I am interested in is compression of the stored data.

Thanks and Regards
Sri


Re: XGET to _mapping does not match the mapping I used to create the index, is this normal?

2014-06-08 Thread Jun Ohtani

Hi,

You’re welcome.

Btw, feel free to ask me on Twitter or elsewhere as well; I'm happy to answer there too. Of course, the mailing list is better, since everyone can see it.


Jun Ohtani
joht...@gmail.com
blog : http://blog.johtani.info
twitter : http://twitter.com/johtani

On 2014/06/07 at 0:17, Enno Shioji wrote:

> Hi Jun,
> 
> Ah, I must be doing something wrong then. I'll correct the JSON and test 
> again.
> 
> Btw, thank you very much for taking the trouble to test this!
> 
> 
> 
> On Friday, 6 June 2014 15:29:37 UTC+1, Jun Ohtani wrote:
> Hi, 
> 
> How do you call the API to create the index? 
> 
> I think the “dynamic” property is in the wrong place. 
> 
> I created an index with the following JSON and then fetched its mapping: 
> 
> curl -XPOST localhost:9200/myidx -d ' 
> { 
>   "settings": { 
>     "index.refresh_interval": "5m" 
>   }, 
>   "mappings": { 
>     "message": { 
>       "dynamic": "strict", 
>       "_ttl": { 
>         "enabled": true 
>       }, 
>       "properties": { 
>         "my_nested_thing": { 
>           "type": "nested", 
>           "properties": { 
>             "some_id": { 
>               "type": "string", 
>               "index": "not_analyzed" 
>             }, 
>             "count": { 
>               "type": "long" 
>             } 
>           } 
>         } 
>       } 
>     } 
>   } 
> }' 
> 
> curl -XGET localhost:9200/myidx/_mapping?pretty 
> 
> { 
>   "myidx" : { 
>     "mappings" : { 
>       "message" : { 
>         "dynamic" : "strict", 
>         "_ttl" : { 
>           "enabled" : true 
>         }, 
>         "properties" : { 
>           "my_nested_thing" : { 
>             "type" : "nested", 
>             "properties" : { 
>               "count" : { 
>                 "type" : "long" 
>               }, 
>               "some_id" : { 
>                 "type" : "string", 
>                 "index" : "not_analyzed" 
>               } 
>             } 
>           } 
>         } 
>       } 
>     } 
>   } 
> } 
> 
> Does it make sense? 
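As a sanity check before creating an index, the placement problem discussed here can be detected mechanically: every key directly under "mappings" should be a type name whose value is an object, so a scalar such as "dynamic": "strict" at that level is suspect. A minimal Python sketch; the helper name `misplaced_mapping_keys` is hypothetical and the two dicts are abbreviated versions of the mappings in this thread.

```python
def misplaced_mapping_keys(mappings):
    """Return keys directly under "mappings" whose value is not an object.

    Keys at this level are expected to be type names mapped to objects,
    so anything else probably belongs one level further down.
    """
    return sorted(key for key, value in mappings.items() if not isinstance(value, dict))


# Abbreviated "mappings" section as originally posted: "dynamic" sits beside the type.
ORIGINAL = {
    "dynamic": "strict",
    "message": {"_ttl": {"enabled": True}, "properties": {}},
}

# Abbreviated corrected version: "dynamic" sits inside the "message" type.
CORRECTED = {
    "message": {"dynamic": "strict", "_ttl": {"enabled": True}, "properties": {}},
}

if __name__ == "__main__":
    print(misplaced_mapping_keys(ORIGINAL))
    print(misplaced_mapping_keys(CORRECTED))
```

The check is deliberately shallow; it only looks at the first level under "mappings" and says nothing about whether the options inside a type are valid.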
> 
>  
> Jun Ohtani 
> joh...@gmail.com 
> blog : http://blog.johtani.info 
> twitter : http://twitter.com/johtani 
> 
> On 2014/06/06 at 22:31, Enno Shioji wrote: 
> 
> > Hi, I created my index using this mapping JSON: 
> > 
> > { 
> >   "myidx": { 
> >     "index.refresh_interval":"5m", 
> >     "mappings": { 
> >       "dynamic": "strict", 
> >       "message": { 
> >         "_ttl": { 
> >           "enabled": true 
> >         }, 
> >         "properties": { 
> >           "my_nested_thing": { 
> >             "type": "nested", 
> >             "properties": { 
> >               "some_id": { 
> >                 "type": "string", "index": "not_analyzed" 
> >               }, 
> >               "count": { 
> >                 "type": "long" 
> >               }, 
> >             } 
> >           } 
> >         } 
> >       } 
> >     } 
> >   } 
> > } 
> > 
> > If I do a GET to _mapping after indexing some documents, it will 
> > essentially return: 
> > 
> > { 
> >   "message": { 
> >     "properties": { 
> >       "my_nested_thing": { 
> >         "properties": { 
> >           "some_id": { 
> >             "type": "string", "index": "not_analyzed" 
> >           }, 
> >           "count": { 
> >             "type": "long" 
> >           }, 
> >         } 
> >       } 
> >     } 
> >   } 
> > } 
> > 
> > I.e. "_ttl": enabled=true and the "type": "nested" is not present from the 
> > mapping. I also noticed that it allows auto update to the mapping despite 
> > the "dynamic": "strict" instruction. 
> > 
> > Does this mean these instructions are somehow not being reflected? If so, 
> > what am I doing wrong? 
> > 
> > I'm using version 1.2.1 


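A quick way to see exactly which keys went missing is to diff the mapping that was sent against what _mapping returns. The following is a minimal Python sketch and not part of the original thread: `missing_paths` is a hypothetical helper, and the two dictionaries are transcribed from the quoted message above (trailing commas dropped so that they are valid JSON).

```python
def missing_paths(sent, returned, prefix=""):
    """Return dotted paths that are in `sent` but absent or different in `returned`."""
    missing = []
    for key, value in sent.items():
        path = f"{prefix}.{key}" if prefix else key
        if key not in returned:
            missing.append(path)
        elif isinstance(value, dict) and isinstance(returned[key], dict):
            # Recurse into nested objects so the report points at the exact key.
            missing.extend(missing_paths(value, returned[key], path))
        elif value != returned[key]:
            missing.append(path)
    return missing


# Mapping as submitted (the part visible in the quoted message).
sent = {
    "dynamic": "strict",
    "message": {
        "_ttl": {"enabled": True},
        "properties": {
            "my_nested_thing": {
                "type": "nested",
                "properties": {
                    "some_id": {"type": "string", "index": "not_analyzed"},
                    "count": {"type": "long"},
                },
            }
        },
    },
}

# Mapping as reported back by GET _mapping.
returned = {
    "message": {
        "properties": {
            "my_nested_thing": {
                "properties": {
                    "some_id": {"type": "string", "index": "not_analyzed"},
                    "count": {"type": "long"},
                }
            }
        }
    }
}

for path in missing_paths(sent, returned):
    print("missing:", path)
```

With the data above, the three reported paths are the same three settings the poster found missing: "dynamic", "_ttl" and the "nested" type.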


Tribe problem in creating native thread

2014-06-08 Thread Srećko Morović

Hello,

We are trying to start up a tribe node that connects to ~60 clusters,
each consisting of several nodes.
We use unicast from the tribe server to discover the master node of each
cluster. Master nodes are fixed (if a master fails, the other machines in
its cluster will be out of service anyway).

The connection step takes a while, during which the server appears to open
a large number of threads (~1000 AFAICT, judging by /proc info) before
failing with this:

[2014-06-06 18:27:51,122][ERROR][bootstrap] [Ghaur]
{1.2.0}: Startup Failed ...
- ElasticsearchException[unable to create new native thread]
OutOfMemoryError[unable to create new native thread]

We tried changing the Java -Xss parameter to adjust the thread stack size,
but it had no effect.

From the code, we understand that the tribe node creates a thread to wait
on a connection to each node in each cluster. In our case this can add up
to several hundred machines (although not all of them are installed yet).
If the tribe node works by querying each individual node, it might not
scale for our system. We were hoping instead that it could work by proxying
queries through each cluster's master node. Is something like that possible?

Any help or advice is appreciated.

Thanks,
Srecko
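
One thing that may be worth ruling out (an assumption, not a confirmed diagnosis): "unable to create new native thread" is commonly reported when the process hits an operating-system limit on threads or processes rather than a JVM heap or stack limit, which would be consistent with -Xss having no effect. The Python sketch below reads the per-user process limit on a Unix-like host; it should be run as the same user that starts Elasticsearch. The helper names are hypothetical.

```python
import resource


def describe_limit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to a readable string."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def nproc_limits():
    """Return the (soft, hard) limit on processes/threads for the current user."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


soft, hard = nproc_limits()
print(f"max user processes: soft={describe_limit(soft)} hard={describe_limit(hard)}")
```

If the soft limit turns out to be close to the ~1000 threads observed before the failure, raising it (for example with `ulimit -u` in the shell that launches the node) would be the next thing to try.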


Re: Understanding merge statistics from Marvel

2014-06-08 Thread John Smith

I know benchmarking is a tough subject! But what do those numbers mean?

On Friday, 6 June 2014 12:17:22 UTC-4, John Smith wrote:
>
> Running Elasticsearch 1.2.1 with Java 1.7_55 on CentOS 6.5
>
> The machine has 32 cores and 96GB of RAM with standard spinning disks, but 
> I also installed one Samsung Evo 840 for testing ES.
> The Evo is rated at 500MB/s, though the Linux perf test reported about 
> 300MB/s read and about 250MB/s write. The board is SATA II, which explains 
> why it tops out at 300MB/s.
>
> Using Jmeter to send index requests to ES
>
> Executing about 6200 puts/s
>
> Marvel reports:
> 2200 IOPS
> 20MB merges/s
>
> And iostat for the drive:
>
> sdf   0.00 14214.00 0.00 2021.33 0.00 62.35 63.17 10.49 5.17 0.48 97.27
>
> Also seeing on the console: stop throttling indexing: 
> numMergesInFlight=4, maxNumMerges=5
>
> Are these numbers good?
>
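
For reference, the iostat line quoted above is internally consistent if it is read with the extended-statistics column layout in megabyte mode (rrqm/s, wrqm/s, r/s, w/s, rMB/s, wMB/s, avgrq-sz, avgqu-sz, await, svctm, %util). That layout is an assumption, since the header row is not in the message. A small Python sketch of the check:

```python
# Column names are an assumption (sysstat `iostat -x -m` style layout); the
# header row was not included in the original message.
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rMB/s", "wMB/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]

line = "sdf 0.00 14214.00 0.00 2021.33 0.00 62.35 63.17 10.49 5.17 0.48 97.27"


def parse_iostat(text):
    """Split one device line into its name and a column -> value mapping."""
    device, *values = text.split()
    return device, dict(zip(COLUMNS, map(float, values)))


device, stats = parse_iostat(line)

total_mb = stats["rMB/s"] + stats["wMB/s"]
total_iops = stats["r/s"] + stats["w/s"]

# Average request size in 512-byte sectors, derived from throughput and IOPS.
derived_avgrq = total_mb * 2048 / total_iops

# Utilisation in percent, derived from service time (ms) times requests/s.
derived_util = stats["svctm"] * total_iops / 10.0

print(f"{device}: avgrq-sz reported {stats['avgrq-sz']}, derived {derived_avgrq:.2f}")
print(f"{device}: %util reported {stats['%util']}, derived {derived_util:.1f}")
```

Read this way, the SSD is serving roughly 2,000 write requests per second at about 62 MB/s and is close to fully utilised, which is in the same range as the ~2200 IOPS that Marvel reports.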


Re: scala elastic4s usage question

2014-06-08 Thread Stephen Samuel

"as" is only used to start a mapping definition, i.e. in the outer block. You 
are mapping nested fields, in which case you want to use "nested" or 
"inner", depending on your use case.

Here is an example taken from the unit tests:

  create.index("users").shards(2).mappings(
    "tweets" as (
      id typed StringType analyzer KeywordAnalyzer,
      "name" typed StringType analyzer KeywordAnalyzer,
      "locations" typed GeoPointType validate true normalize true,
      "date" typed DateType precisionStep 5,
      "size" typed LongType,
      "read" typed BooleanType,
      "content" typed StringType,
      "user" nested (
        "name" typed StringType,
        "email" typed StringType,
        "last" nested {
          "lastLogin" typed DateType
        }
      )
    ) size true numericDetection true boostNullValue 1.2 boost "myboost"
  )

On Friday, June 6, 2014 8:25:12 PM UTC+1, Ramdev Wudali wrote:
>
> Hi:
>   I have started using elastic4s, the Scala client library. I am running 
> into a problem creating a mapping that has a straightforward definition 
> (yet has some complexity).
>
> Example :
>
> {
>   "index": {
>     "mappings": {
>       "OA": {
>         "properties": {
>           "AdminStatus": {
>             "properties": {
>               "content": {
>                 "type": "string"
>               },
>               "effectiveFrom": {
>                 "type": "date",
>                 "format": "dateOptionalTime"
>               }
>             }
>           },
>           "IsPublicFlag": {
>             "type": "boolean"
>           },
>           "OrganizationAddress": {
>             "properties": {
>               "OrganizationAddressCity": {
>                 "type": "string"
>               },
>               "OrganizationAddressCountryCode": {
>                 "type": "string"
>               },
>               "OrganizationAddressLine1": {
>                 "type": "string"
>               }
>             }
>           }
>         }
>       }
>     }
>   }
> }
>
>
>
>
>
> I am not able to figure out how to define the OrganizationAddress field 
> (which is a "complex" object).
>
> If I define it like this:
>
> indexClient.execute {
>   create index "index" mappings (
>     "OA" as (
>       "AdminStatus" as (
>         "content" typed StringType,
>         "effectiveFrom" typed DateType
>       ),
>       "IsPublicFlag" typed BooleanType,
>       "OrganizationAddress" as (
>         "OrganizationAddressCity" typed StringType,
>         "OrganizationAddressLine1" typed StringType,
>       )
>     ) }
>
> I get a compilation error:
> Error:(52, 69) type mismatch;
>  found   : com.sksamuel.elastic4s.mapping.MappingDefinition
>  required: com.sksamuel.elastic4s.mapping.TypedFieldDefinition
>   "AdminStatus" as (
> ^
>
> How can I map complex objects using elastic4s, the Scala interface?
>
> Thanks
>
> Ramdev
>


Adding relevance to query_string query, help required

2014-06-08 Thread see613

I am using a "query_string" query with morphology and wildcards.

'searchAnalyzer' => array(
   'type' => 'custom',
   'tokenizer' => 'standard',
   'filter' => array('lowercase', 'word_delimiter', 'snowBallRus', 'russian_morphology')
),

GET _search
{
  "explain": true,
  "query": {
    "query_string": {
      "query": "*test*",
      "analyzer": "searchAnalyzer",
      "fields": ["name^100", "content^1"],
      "analyze_wildcard": true
    }
  },
  "highlight": {
    "fields": {
      "name": {
        "fragment_size": 1,
        "number_of_fragments": 1
      },
      "content": {
        "fragment_size": 100,
        "number_of_fragments": 5
      }
    },
    "pre_tags": [""],
    "post_tags": [""]
  }
}

I was surprised to find, from the query explanation, that "query_string" 
uses a constant score here.

"_explanation": {
   "value": 0.1,
   "description": "max of:",
   "details": [
  {
 "value": 0.1,
 "description": "ConstantScore(content:*test*), product of:",   <--  constant score
 "details": [
{
   "value": 1,
   "description": "boost"
},
{
   "value": 0.1,
   "description": "queryNorm"
}
 ]
  }
   ]
}

I want to combine wildcards and morphology with the "cross_fields" score type 
(as in the "multi_match" query), i.e. the sum of all keyword scores across 
all fields.

Does anyone know how I can achieve this? Thanks in advance
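
One option that may be worth testing (a suggestion, not a confirmed fix): wildcard terms are multi-term queries, and Elasticsearch rewrites those to a constant-score form by default, which is what the explanation above shows. The "query_string" query accepts a "rewrite" parameter that controls this, and "scoring_boolean" asks for per-term scores instead. The Python sketch below only builds the request body; it does not by itself provide "cross_fields" behaviour, and `build_query` is a hypothetical helper name.

```python
import json


def build_query(text, fields, analyzer, rewrite="scoring_boolean"):
    """Build a query_string search body with an explicit multi-term rewrite."""
    return {
        "explain": True,
        "query": {
            "query_string": {
                "query": text,
                "analyzer": analyzer,
                "fields": fields,
                "analyze_wildcard": True,
                # Ask for scored term clauses rather than a constant score.
                "rewrite": rewrite,
            }
        },
    }


body = build_query("*test*", ["name^100", "content^1"], "searchAnalyzer")
print(json.dumps(body, indent=2))
```

Because "scoring_boolean" expands the wildcard into one clause per matching term, a very broad pattern can run into the boolean clause limit; the "top_terms_N" rewrite is the bounded alternative.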

