ElasticSearch-yarn - how to install plug-ins

2015-01-16 Thread Dan Cieslak
Is it possible to install plug-ins into Elasticsearch when running it under 
YARN? I can't quite seem to figure out how to invoke the plugin command via 
Hadoop.

Thanks
Dan

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4680aff2-de90-4731-9216-becdfbee74c5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Is it possible to install a plugin into a directory other than ${ES_HOME}/plugins?

2015-01-16 Thread Jinyuan Zhou
Thanks,



Re: Missing closed indexes after adding node

2015-01-16 Thread Mark Walkom
The data is there, it's just closed, and there are no actions taken on
closed indexes.
You need to reopen them -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/indices-open-close.html#indices-open-close
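For reference, a minimal sketch of the open call (the index name is a placeholder):

```shell
# Reopen a single closed index (replace my_index with your index name)
curl -XPOST "localhost:9200/my_index/_open"

# Or reopen every closed index at once
curl -XPOST "localhost:9200/_all/_open"
```

Once reopened, the indexes can be deleted normally if you only want the disk space back.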

On 17 January 2015 at 02:35, Russell Butturini  wrote:

> Hey guys,
>
> New to Elasticsearch but already a huge fan.  I had a strange incident
> happen I was hoping you could provide some insight on.
>
> I set up a single Elastic search node for a project and collected some
> data in it for a few days to make sure everything was working correctly. No
> issues.  I went through and closed those indexes from my testing, and added
> a second node to the cluster.  When I did that...POOF! All the closed
> indexes disappeared.  No big deal to me, but I can see the disk space is
> still being used by those indexes.  They didn't replicate to the second
> node (the two VMs are identical) because the disk space usage over there is
> much lower.  Is there any way I can either recover the indexes and properly
> purge them, or just remove them from disk by some other method? I don't
> really care about the data, but would like to get the space back.
>
> Thanks in advance for any help!
>
> -Russell
>



Re: Resetting Node Statistics

2015-01-16 Thread Mark Walkom
Unfortunately there isn't.
Feel free to raise an enhancement request on github though, as it could be
useful for others :)

On 17 January 2015 at 03:52, Darren McDaniel  wrote:

> Short of restarting the node.. Is there any thought of giving a user the
> ability to reset the node statistics manually?
>
> For example,  we're running our cluster in an ESX environment, and have
> recently VMotion'd the data to all SSD.
>
> We'd like to be able to reset all the stats for all the nodes without
> restarting the cluster.
>
> Thanks!
>



Re: Cached Elasticsearch Information

2015-01-16 Thread Mark Walkom
If you are just untarring and starting the ES binary, it'll use the
defaults, which is multicast and default cluster name.
So it'll search for any other nodes, and if they have the same cluster
name and can reach each other on the network, they will try to form a
cluster; that is what you are seeing.
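To avoid accidentally joining another cluster, a minimal sketch of the relevant settings in config/elasticsearch.yml for ES 1.x (the cluster name and host addresses are placeholders):

```yaml
cluster.name: my-cluster                          # anything other than the default "elasticsearch"
discovery.zen.ping.multicast.enabled: false       # disable multicast discovery
discovery.zen.ping.unicast.hosts: ["10.0.0.45"]   # placeholder: list your known nodes
```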

On 17 January 2015 at 09:15, Albion Baucom  wrote:

> No configuration management in place. I manually untarred the files. So
> all configuration information should be contained in the application
> directory?
>
> Here is what I see, even after a reboot: an Elasticsearch instance named
> *Nightwatch* looking for the phantom "Plantman" master, even though this
> node is configured to be a master only:
>
> [root@es-master elasticsearch-1.4.2]#
> [2015-01-16 11:57:43,500][INFO ][node ] [*Nightwatch*]
> version[1.4.2], pid[24139], build[927caff/2014-12-16T14:11:12Z]
> [2015-01-16 11:57:43,501][INFO ][node ] [Nightwatch]
> initializing ...
> [2015-01-16 11:57:43,505][INFO ][plugins  ] [Nightwatch]
> loaded [], sites []
> [2015-01-16 11:57:46,701][INFO ][node ] [Nightwatch]
> initialized
> [2015-01-16 11:57:46,701][INFO ][node ] [Nightwatch]
> starting ...
> [2015-01-16 11:57:46,903][INFO ][transport] [Nightwatch]
> bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/
> 10.0.0.45:9300]}
> [2015-01-16 11:57:46,919][INFO ][discovery] [Nightwatch]
> lsflogs/A6CISFuzRkK31rL2dFFdNA
> [2015-01-16 11:57:50,187][INFO ][discovery.zen] [Nightwatch] 
> *failed
> to send join request to master [[Plantman]*
> [VOmbX4yORHiObk7Q3D7tbQ][es-master][inet[/10.0.0.45:9300]]{data=false,
> master=true}], reason
> [RemoteTransportException[[Nightwatch][inet[/10.0.0.45:9300]][internal:discovery/zen/join]];
> nested: ElasticsearchIllegalStateException[Node
> [[Nightwatch][A6CISFuzRkK31rL2dFFdNA][es-master][inet[/10.0.0.45:9300]]{data=false,
> master=true}] not master for join request from
> [[Nightwatch][A6CISFuzRkK31rL2dFFdNA][es-master][inet[/10.0.0.45:9300]]{data=false,
> master=true}]]; ], tried [3] times
>
> My workaround was to rename the cluster. Perhaps there was another node
> with that information and it was confusing the master?
>
> But you answered my question, which is that there is no information written
> outside of the application directory that could cache settings for future
> clusters if I tear them down and rebuild them.
>
> Thanks
> Albion
>
>
> On Wednesday, January 14, 2015 at 6:42:23 PM UTC-8, Mark Walkom wrote:
>>
>> It doesn't cache this sort of info, it'll read what is in the config file.
>>
>> Are you using puppet/chef/other for config management perchance? These
>> could be over writing your config.
>>
>> On 15 January 2015 at 06:22, Albion Baucom  wrote:
>>
>>> I am new to ELK and I am still using a dev environment with real data to
>>> understand how the pipeline works.
>>>
>>> Recently I had nodes networking between them that were part of a 4 node
>>> cluster: 1 master node, no data and 3 data-only nodes. These were working
>>> fine up to the point that they lost connectivity between themselves. I am
>>> using a unicast cluster setup as multicast was not working on my OpenStack
>>> cluster (a question for another future post).
>>>
>>> When I rebooted the nodes and tried to bring the master back up it tried
>>> to join the previous master instance. Clearly there is information cached
>>> about the previous running instance that needs to be flushed. As this is a
>>> dev environment, I copied the config file and blew away the elastic search
>>> directory, copied the config back and tried to restart elasticsearch.
>>> Curiously it is still trying to join the old master, even though no other
>>> processes are running. Obviously this cached data lives somewhere else on
>>> the system and I have yet to find it.
>>>
>>> Perhaps someone can point me in the right direction here.
>>>
>>

Re: Template is missing

2015-01-16 Thread Mark Walkom
 Can you put your complete call into a gist or similar for us to check?
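For anyone else hitting this: that validation error usually means the request body is absent or lacks the required `template` index-pattern field. A hedged sketch of a valid call for ES 1.x (template name, pattern, and mapping are placeholders):

```shell
curl -XPUT "localhost:9200/_template/my_template" -d '{
  "template": "logs-*",
  "settings": { "number_of_shards": 1 },
  "mappings": {
    "event": {
      "properties": { "message": { "type": "string" } }
    }
  }
}'
```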

On 17 January 2015 at 05:34, Gabriele Angeli  wrote:

> Hi guys, I am trying to put a new template into ES 1.3.6 but I always get
> the same result: {"error":"ActionRequestValidationException[Validation Failed:
> 1: template is missing;]","status":500}
> Does anyone know anything about this error? Thanks in advance
>



Re: Performance issues when sending documents to multiple indexes at the same time.

2015-01-16 Thread Mark Walkom
You've got too many replicas and shards. One shard per node (maybe 2) and
one replica is enough.

You should be using the bulk API as well.

What's your heap set to?

Also consider combining customers into one index, it'll reduce the work you
need to do.
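A minimal sketch of a bulk insert for ES 1.x (index and type names are placeholders; each action line is followed by its document source, and the payload must end with a newline):

```shell
curl -XPOST "localhost:9200/_bulk" --data-binary '
{"index":{"_index":"customer-2015.01.16","_type":"event"}}
{"field1":"value1"}
{"index":{"_index":"customer-2015.01.16","_type":"event"}}
{"field1":"value2"}
'
```

Note `--data-binary` rather than `-d`, so the newlines that separate the NDJSON lines are preserved.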
 On 17/01/2015 4:07 am, "Nawaaz Soogund"  wrote:

> We are experiencing some performance issues, or anomalies, on an
> Elasticsearch cluster, specifically on a system we are currently building.
>
>
>
> *The requirements:*
>
> We need to capture data for multiple customers, who will query and
> report on it on a near real-time basis. All the documents received have
> the same format with the same properties and are in a flat structure (all
> fields are of primitive type with no nested objects). We want to keep each
> customer’s information separate from the others.
>
>
>
> *Frequency of data received and queried:*
>
> We receive data for each customer at a fluctuating rate of 200 to 700
> documents per second – with the peak being in the middle of the day.
>
> Queries will mostly be aggregations over around 12 million documents per
> customer – histograms/percentiles to show patterns over time and the
> occasional raw document retrieval to find out what happened at a particular
> point in time. We are aiming to serve 50 to 100 customers at varying rates
> of document insertion – the smallest could be 20 docs/sec and the
> largest could peak at 1000 docs/sec for some minutes.
>
>
>
> *How are we storing the data:*
>
> Each customer has one index per day. For example, if we have 5 customers,
> there will be a total of 35 indexes for the whole week. The reason we break
> it per day is because it is mostly the latest two that get queried with
> occasionally the remaining others. We also do it that way so we can delete
> older indexes independently of customers (some may want to keep 7 days,
> some 14 days’ worth of data)
>
>
>
> *How we are inserting:*
>
> We are sending data in batches of 10 to 2000 – every second. One document
> is around 900bytes raw.
>
>
>
> *Environment*
>
> AWS C3-Large – 3 nodes
>
> All indexes are created with 10 shards with 2 replica for the test purposes
>
> Both Elasticsearch 1.3.2 and 1.4.1
>
>
>
> *What we have noticed:*
>
>  If I push data to one index only, response time starts at 80 to 100ms
> for each batch inserted when the rate of insert is around 100 documents per
> second.  I can ramp it up to 1600 before the rate of insert gets close to
> 1 sec per batch, and when I increase it to close to 1700, it hits a wall at
> some point because of concurrent insertions and the time spirals to 4 or 5
> seconds. That said, if I reduce the rate of inserts, Elasticsearch recovers
> nicely. CPU usage increases as the rate increases.
>
>
>
> If I push to 2 indexes concurrently, I can reach a total of 1100 and CPU
> goes up to 93% around 900 documents per second.
>
> If I push to 3 indexes concurrently, I can reach a total of 150 and CPU
> goes up to 95 to 97%. I tried it many times. The interesting thing is that
> response time is around 109ms at the time. I can increase the load to 900
> and response time will still be around 400 to 600 but CPU stays up.
>
>
> *Question:*
>
> Looking at our requirements and findings above, is the design convenient
> for what’s asked? Are there any tests that I can do to find out more? Is
> there any setting that I need to check (and change)?
>



Re: Grandchild is not getting fetched by parent id

2015-01-16 Thread Iv Igi
Got my mistake, thank you!
And sorry for missing the man page.
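For anyone else hitting this: the fix is to pass both `parent` and `routing` when indexing the grandchild, so all three generations land on the same shard, and to use the same routing value when searching. A sketch using the account/user/address IDs from the script quoted in this thread:

```shell
# Index the grandchild with parent=its parent doc AND routing=the grandparent
curl -XPUT "localhost:9200/the_index/address/smithshouse?parent=john&routing=mrsmith" -d '{
  "baz" : "baz"
}'

# Searches for the grandchild must then route by the grandparent as well
curl -XGET "localhost:9200/the_index/address/_search?routing=mrsmith&pretty" -d '{
  "query" : { "term" : { "_parent" : "john" } }
}'
```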

On Friday, January 16, 2015 at 7:05:17 UTC+3, Masaru Hasegawa wrote:
>
> Hi Iv, 
>
> You’d need to specify both parent and routing when you index grand 
> children. 
> See 
> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/grandparents.html
>  
>
>
> Masaru 
>
> On January 15, 2015 at 20:44:43, Iv Igi (sayo...@gmail.com ) 
> wrote: 
> > I am experiencing an issue while trying to retrieve a grandchild record 
> by 
> > its parent ID. (child-grandchild relationship) 
> > The amount of hits in result is always zero. 
> > Also the same request is working fine for parent-child relationship. 
> >   
> > My records are getting organized kinda like this: 
> >   
> > Account --(one to one)--> User --(one to one)--> Address 
> >   
> > My execution environment is: 
> > - Fedora 21 CE 
> > - openjdk 1.8.0_25 
> > - ES 1.4.2 
> >   
> > Here is a script that is showing the problem 
> >   
> > # index creation 
> > curl -XPUT "localhost:9200/the_index/" -d "{ 
> > \"mappings\": { 
> > \"account\" : {}, 
> > \"user\" : { 
> > \"_parent\" : { 
> > \"type\" : \"account\" 
> > } 
> > }, 
> > \"address\" : { 
> > \"_parent\" : { 
> > \"type\" : \"user\" 
> > } 
> > } 
> > } 
> > }"; 
> >   
> > # mrsmith account creation 
> > curl -XPUT "localhost:9200/the_index/account/mrsmith" -d "{ 
> > \"foo\" : \"foo\" 
> > }"; 
> >   
> > # john user creation 
> > curl -XPUT "localhost:9200/the_index/user/john?parent=mrsmith" -d "{ 
> > \"bar\" : \"bar\" 
> > }"; 
> >   
> > # john user creation 
> > curl -XPUT "localhost:9200/the_index/address/smithshouse?parent=john" -d 
> "{   
> > \"baz\" : \"baz\" 
> > }"; 
> >   
> > # Here I am trying to retrieve a record. Getting zero hits. 
> > curl -XGET "localhost:9200/the_index/address/_search?pretty" -d "{ 
> > \"query\" : { \"bool\" : { \"must\" : { \"term\" : { \"_parent\" : 
> > \"john\" } } } } 
> > }"; 
> >   
> > # Another approach with has_parent query type. Still getting zero hits. 
> > curl -XGET "localhost:9200/the_index/address/_search?pretty" -d "{ 
> > \"query\" : { 
> > \"has_parent\" : { 
> > \"parent_type\" : \"user\", 
> > \"query\" : { 
> > \"term\" : { 
> > \"_id\" : \"john\" 
> > } 
> > } 
> > } 
> > } 
> > }"; 
> >   
> > # OK, lets try a routed search. Nope 
> > curl -XGET 
> "localhost:9200/the_index/address/_search?routing=john&pretty"   
> > -d "{ 
> > \"query\" : { \"bool\" : { \"must\" : { \"term\" : { \"_parent\" : 
> > \"john\" } } } } 
> > }"; 
> >   
> > # Routed has_parent query. Same 
> > curl -XGET 
> "localhost:9200/the_index/address/_search?routing=john&pretty"   
> > -d "{ 
> > \"query\" : { 
> > \"has_parent\" : { 
> > \"parent_type\" : \"user\", 
> > \"query\" : { 
> > \"term\" : { 
> > \"_id\" : \"john\" 
> > } 
> > } 
> > } 
> > } 
> > }"; 
> >   
> > # Retrieving a record by itself. Going just fine. 
> > curl -XGET "localhost:9200/the_index/address/smithshouse?parent=john"; 
> >   
> > # Querying for user record with the same query. Got a hit. 
> > curl -XGET "localhost:9200/the_index/user/_search?pretty" -d "{ 
> > \"query\" : { \"bool\" : { \"must\" : { \"term\" : { \"_parent\" : 
> > \"mrsmith\" } } } } 
> > }"; 
> >   
> >   
> >   
> > The output: 
> >   
> > {"acknowledged":true} 
> > 
> {"_index":"the_index","_type":"account","_id":"mrsmith","_version":1,"created":true}{"_index":"the_index","_type":"user","_id":"john","_version":1,"created":true}{"_index":"the_index","_type":"address","_id":"smithshouse","_version":1,"created":true}
>  
>   
> > { 
> > "took" : 54, 
> > "timed_out" : false, 
> > "_shards" : { 
> > "total" : 5, 
> > "successful" : 5, 
> > "failed" : 0 
> > }, 
> > "hits" : { 
> > "total" : 0, 
> > "max_score" : null, 
> > "hits" : [ ] 
> > } 
> > } 
> > { 
> > "took" : 221, 
> > "timed_out" : false, 
> > "_shards" : { 
> > "total" : 5, 
> > "successful" : 5, 
> > "failed" : 0 
> > }, 
> > "hits" : { 
> > "total" : 0, 
> > "max_score" : null, 
> > "hits" : [ ] 
> > } 
> > } 
> > { 
> > "took" : 35, 
> > "timed_out" : false, 
> > "_shards" : { 
> > "total" : 1, 
> > "successful" : 1, 
> > "failed" : 0 
> > }, 
> > "hits" : { 
> > "total" : 0, 
> > "max_score" : null, 
> > "hits" : [ ] 
> > } 
> > } 
> > { 
> > "took" : 481, 
> > "timed_out" : false, 
> > "_shards" : { 
> > "total" : 1, 
> > "successful" : 1, 
> > "failed" : 0 
> > }, 
> > "hits" : { 
> > "total" : 0, 
> > "max_score" : null, 
> > "hits" : [ ] 
> > } 
> > } 
> > 
> {"_index":"the_index","_type":"address","_id":"smithshouse","_version":1,"found":true,"_source":{
>  
>   
> > "baz" : "baz" 
> > }} 
> > { 
> > "took" : 65, 
> > "timed_out" : false, 
> > "_shards" : { 
> > "total" : 5, 
> > "successful" : 5, 
> > "failed" : 0 
> > }, 
> > "hits" : { 
> > "total" : 1, 
> > "max_score" : 1.0, 
> > "hits" : [ { 
> > "_index" : "the_index", 
> > "_type" : "user", 
> > "_id" : "john", 
> > "_score" : 1.0, 
> > "_source":{ 
> > "bar" : "bar" 
> > } 
> > } ] 
> > } 
> > } 
> >   

Re: Migrating lucene drill sideways query to elasticsearch

2015-01-16 Thread joergpra...@gmail.com
I do not think you have to worry; I use a dozen aggregations with
filters successfully on 50m docs with 8G RAM and 3 nodes. But if your tests
show a massive slowdown, you should come back with your findings, including
performance numbers, so the ES core team can have a look at it.
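A hedged sketch of the per-facet filter approach (index and field names follow the example in the thread): each facet's `terms` aggregation is wrapped in a `filter` aggregation that applies every drill-down selection except its own, here a selection on year 2012.

```shell
curl -XGET "localhost:9200/items/_search?pretty" -d '{
  "size": 0,
  "query": { "match_all": {} },
  "aggs": {
    "year_facet": {
      "filter": { "match_all": {} },
      "aggs": { "year": { "terms": { "field": "year" } } }
    },
    "author_facet": {
      "filter": { "term": { "year": 2012 } },
      "aggs": { "author": { "terms": { "field": "author" } } }
    },
    "language_facet": {
      "filter": { "term": { "year": 2012 } },
      "aggs": { "language": { "terms": { "field": "language" } } }
    }
  }
}'
```

All facets are computed in one request, so the cost should be far lower than one query per facet.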

Jörg

On Fri, Jan 16, 2015 at 7:35 PM, Bo Finnerup Madsen 
wrote:

> Hi Mike,
>
> Thanks, that is in line with what Jörg suggested. I have updated the gist
> with a search using this approach and it gives the correct result. However
> I am a bit concerned about the cost of this, as we will be running 4-5
> facets each of which will require their own set of filters.
> But if it is the recommended way, I will try to implemented it and run a
> performance test :)
>
> Den fredag den 16. januar 2015 kl. 18.13.34 UTC+1 skrev Michael McCandless:
>>
>> I think you must do separate filters to compute the sideways facet counts.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Fri, Jan 16, 2015 at 10:15 AM, Bo Finnerup Madsen > > wrote:
>>
>>> Hi,
>>>
>>> I am trying to migrate a project from Lucene to elasticsearch, and for
>>> the most part it is a pleasure :)
>>> However, I cannot wrap my head around how to recreate the drill sideways
>>> queries we currently use in Lucene.
>>>
>>> The scenario is a basic search page with a free text search and a bunch
>>> of drill down/sideways facets. In Lucene, the hits that we get for each
>>> facet, is a correct representation of how many results we would get if that
>>> facet is used as a limit, but I am unable to do this in elasticsearch...
>>>
>>> As an example (full gist available here: https://gist.github.com/
>>> bogundersen/e9bac02779e1c4a089dc)
>>>
>>> I have three items:
>>> Item 1:
>>>   language : en_GB,
>>>   year: 2013,
>>>   author: [ John, Paul ]
>>> Item 2:
>>>   language : en_GB,
>>>   year: 2012,
>>>   author: [ John, George ]
>>> Item 3:
>>>   language : da_DK,
>>>   year: 2012,
>>>   author: [ Ringo ]
>>>
>>> Now lets imagine that the user limits to year 2012. If I just include
>>> the facet in the query ("Search 2" in the gist), I would get the following
>>> facets:
>>> year
>>>   2012 : 2
>>> author
>>>   George 1
>>>   John   1
>>>   Ringo  1
>>> language
>>>   da_DK  1
>>>   en_GB  1
>>> The author and language facets show the correct numbers, but the year
>>> facet only shows year 2012 thereby not allowing the user to select another
>>> year without deselecting 2012.
>>>
>>> A way around this is to use post filters ("Search 3" in the gist), using
>>> those I get the following facet results:
>>> year
>>>   2012 : 2
>>>   2013 : 1
>>> author
>>>   John   2
>>>   George 1
>>>   Paul   1
>>>   Ringo  1
>>> language
>>>   en_GB  2
>>>   da_DK  1
>>> Here the user is still presented with other years, but the numbers for
>>> author and language are not correct (e.g. selecting "John" will only give 1
>>> result, and not two)
>>>
>>> The only way I can think of to make this work, is to do separate queries
>>> for each facet, but that seems counter intuitive and not very performance
>>> friendly. Any ideas on how to do this in elasticsearch?
>>>
>>> --
>>> Bo Madsen
>>>
>>
>



Re: Using icu_collation plugin in Unit Tests

2015-01-16 Thread joergpra...@gmail.com
You have to include transitive dependencies as well; in this case, Lucene
ICU. Most IDEs do this by default.

Jörg
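For example, with Maven the missing Lucene ICU classes can be pulled in explicitly. The version below is an assumption (ES 1.4.x shipped against Lucene 4.10.x); align it with the Lucene version your ES build actually uses:

```xml
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-icu</artifactId>
    <!-- placeholder version; match your ES build's Lucene version -->
    <version>4.10.2</version>
</dependency>
```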

On Fri, Jan 16, 2015 at 9:33 PM, Kumar S  wrote:

> Hi Jorg,
> Thanks!
>
> I get NoClassDefFound:
> org/apache/lucene/analysis/icu/segmentation/icutokenizer
>
> Thanks,
> Kumar
>
>



Re: Using icu_collation plugin in Unit Tests

2015-01-16 Thread Kumar S
I am using ES 1.4.2 & ES-analysis-icu 2.4.1

On Friday, January 16, 2015 at 12:33:00 PM UTC-8, Kumar S wrote:
>
> Hi Jorg,
> Thanks!
>
> I get NoClassDefFound: 
> org/apache/lucene/analysis/icu/segmentation/icutokenizer
>
> Thanks,
> Kumar
>
> On Friday, January 16, 2015 at 12:00:45 AM UTC-8, Jörg Prante wrote:
>>
>> You don't need to manually download the jar file if you use Maven. Add 
>> the jar as dependency to your pom.xml
>>
>> <dependency>
>>     <groupId>org.elasticsearch</groupId>
>>     <artifactId>elasticsearch-analysis-icu</artifactId>
>>     <version>2.4.1</version>
>> </dependency>
>>
>> Jörg
>>
>> On Thu, Jan 15, 2015 at 10:47 PM, Kumar S  wrote:
>>
>>> Thanks David!
>>>
>>> Sorry for being new to the ES world. But where would I download 
>>> the JAR file from, and what class should I be using for the icu_collation?
>>>
>>> Thank you very much,
>>> Kumar Subramanian,
>>>
>>> On Thursday, January 15, 2015 at 12:52:12 PM UTC-8, David Pilato wrote:

 You most likely just need to add it as a dependency, which is easy if 
 you are using Maven.

 David

 On January 15, 2015 at 21:03, Kumar S wrote:

 Hi,
 I am new to ES. I am using NodeBuilder in my unit test to run a local 
 instance of ES. I would like to use the icu_collation plugin. How can i 
 install and run the plugin form within this local instance. Is there API 
 that i should use? if not, what are the different ways i can do this?

 Thank you very much,
 Kumar Subramanian.





Re: Identity/Autonumber ID's

2015-01-16 Thread Nicolai Kamenzky
For better performance the lib fetches 100 ids at a time and caches them. 
This way Elasticsearch is only queried on every 100th get call.

Do you call init before every get call? That would clear the cache each 
time. init only needs to be called once, on startup.

On Friday, January 16, 2015 at 14:09:22 UTC-5, Энхтөр Энхбат wrote:
>
> Hi, I tried your module; it is useful and nice. But why does the sequence
> version increment by 100? Is it doing blank updates 100 times?
>



Re: Using icu_collation plugin in Unit Tests

2015-01-16 Thread Kumar S
Hi Jorg,
Thanks!

I get NoClassDefFound: 
org/apache/lucene/analysis/icu/segmentation/icutokenizer

Thanks,
Kumar

On Friday, January 16, 2015 at 12:00:45 AM UTC-8, Jörg Prante wrote:
>
> You don't need to manually download the jar file if you use Maven. Add the 
> jar as dependency to your pom.xml
>
> <dependency>
>     <groupId>org.elasticsearch</groupId>
>     <artifactId>elasticsearch-analysis-icu</artifactId>
>     <version>2.4.1</version>
> </dependency>
>
> Jörg
>
> On Thu, Jan 15, 2015 at 10:47 PM, Kumar S  > wrote:
>
>> Thanks David!
>>
>> Sorry for being new to the ES world. But where would I download the 
>> JAR file from, and what class should I be using for the icu_collation?
>>
>> Thank you very much,
>> Kumar Subramanian,
>>
>> On Thursday, January 15, 2015 at 12:52:12 PM UTC-8, David Pilato wrote:
>>>
>>> You most likely just need to add it as a dependency, which is easy if 
>>> you are using Maven.
>>>
>>> David
>>>
>>> On January 15, 2015 at 21:03, Kumar S wrote:
>>>
>>> Hi,
>>> I am new to ES. I am using NodeBuilder in my unit test to run a local 
>>> instance of ES. I would like to use the icu_collation plugin. How can i 
>>> install and run the plugin form within this local instance. Is there API 
>>> that i should use? if not, what are the different ways i can do this?
>>>
>>> Thank you very much,
>>> Kumar Subramanian.
>>>



Re: Using icu_collation plugin in Unit Tests

2015-01-16 Thread Kumar S
Hi Jorg,
Thanks!
I added elasticsearch-analysis-icu 2.4.1 as a dependency, but now I get a 
NoClassDefFoundError wrapped in a Guice CreationException: 
org.elasticsearch.common.inject.CreationException: Guice creation errors:

1) Error injecting constructor, java.lang.NoClassDefFoundError: 
org/apache/lucene/analysis/icu/segmentation/ICUTokenizer
  at org.elasticsearch.indices.analysis.IcuIndicesAnalysis.<init>(Unknown 
Source)
  while locating org.elasticsearch.indices.analysis.IcuIndicesAnalysis

1 error
at 
org.elasticsearch.common.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:344)
at 
org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:178)
at 
org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:110)
at org.elasticsearch.common.inject.Guice.createInjector(Guice.java:93)
at org.elasticsearch.common.inject.Guice.createInjector(Guice.java:70)
at 
org.elasticsearch.common.inject.ModulesBuilder.createInjector(ModulesBuilder.java:59)
at 
org.elasticsearch.node.internal.InternalNode.<init>(InternalNode.java:197)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
at org.elasticsearch.node.NodeBuilder.node(NodeBuilder.java:166)
at com.amazon.clouddrive.elasticsearch.TestBase.setupES(TestBase.java:22)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:27)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at 
org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at 
org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.lang.NoClassDefFoundError: 
org/apache/lucene/analysis/icu/segmentation/ICUTokenizer
at 
org.elasticsearch.indices.analysis.IcuIndicesAnalysis.<init>(IcuIndicesAnalysis.java:51)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at 
org.elasticsearch.common.inject.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:54)
at 
org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:86)
at 
org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:98)
at 
org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:45)
at 
org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:837)
at 
org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:42)
at org.elasticsearch.common.inject.Scopes$1$1.get(Scopes.java:57)
at 
org.elasticsearch.common.inject.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:45)
at 
org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:200)
at 
org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:193)
at 
org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:830)
at 
org.elasticsearch.common.inject.InjectorBuilder.loadEagerSingletons(InjectorBuilder.java:193)
at 
org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:175)
... 24 more
Caused by: java.lang.ClassNotFoundException: 
org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 42 more

java.lang.NullPointerException
at com.TestBase.teardownES(TestBase.java:27)
at sun.reflect.
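For reference, the missing class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer lives in Lucene's lucene-analyzers-icu artifact, which elasticsearch-analysis-icu normally pulls in transitively. If your build is somehow dropping it (for example through an exclusion or a provided scope), declaring it explicitly may help. This is only a sketch, and the version is an assumption: it must match the Lucene version bundled with your Elasticsearch release (4.10.x for ES 1.4).

```xml
<!-- Hypothetical explicit dependency; verify the version against the
     Lucene release your Elasticsearch version ships with. -->
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-analyzers-icu</artifactId>
  <version>4.10.2</version>
</dependency>
```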

Re: Cached Elasticsearch Information

2015-01-16 Thread Albion Baucom
No configuration management in place; I manually untarred the files. So all 
configuration information should be contained in the application directory? 

Here is what I see: even after a reboot, an Elasticsearch instance named 
Nightwatch is looking for the phantom "Plantman" master, even though this 
node is configured to be a master-only node:

[root@es-master elasticsearch-1.4.2]# 
[2015-01-16 11:57:43,500][INFO ][node ] [Nightwatch] 
version[1.4.2], pid[24139], build[927caff/2014-12-16T14:11:12Z]
[2015-01-16 11:57:43,501][INFO ][node ] [Nightwatch] 
initializing ...
[2015-01-16 11:57:43,505][INFO ][plugins  ] [Nightwatch] 
loaded [], sites []
[2015-01-16 11:57:46,701][INFO ][node ] [Nightwatch] 
initialized
[2015-01-16 11:57:46,701][INFO ][node ] [Nightwatch] 
starting ...
[2015-01-16 11:57:46,903][INFO ][transport] [Nightwatch] 
bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address 
{inet[/10.0.0.45:9300]}
[2015-01-16 11:57:46,919][INFO ][discovery] [Nightwatch] 
lsflogs/A6CISFuzRkK31rL2dFFdNA
[2015-01-16 11:57:50,187][INFO ][discovery.zen] [Nightwatch] 
failed 
to send join request to master 
[[Plantman][VOmbX4yORHiObk7Q3D7tbQ][es-master][inet[/10.0.0.45:9300]]{data=false,
 
master=true}], reason 
[RemoteTransportException[[Nightwatch][inet[/10.0.0.45:9300]][internal:discovery/zen/join]];
 
nested: ElasticsearchIllegalStateException[Node 
[[Nightwatch][A6CISFuzRkK31rL2dFFdNA][es-master][inet[/10.0.0.45:9300]]{data=false,
 
master=true}] not master for join request from 
[[Nightwatch][A6CISFuzRkK31rL2dFFdNA][es-master][inet[/10.0.0.45:9300]]{data=false,
 
master=true}]]; ], tried [3] times

My workaround was to rename the cluster. Perhaps there was another node 
with that information and it was confusing the master? 

But you answered my question, which is that no information is written 
outside of the application directory that could cache settings for future 
clusters if I tear them down and rebuild them.

Thanks
Albion


On Wednesday, January 14, 2015 at 6:42:23 PM UTC-8, Mark Walkom wrote:
>
> It doesn't cache this sort of info, it'll read what is in the config file.
>
> Are you using puppet/chef/other for config management perchance? These 
> could be over writing your config.
>
> On 15 January 2015 at 06:22, Albion Baucom > 
> wrote:
>
>> I am new to ELK and I am still using a dev environment with real data to 
>> understand how the pipeline works.
>>
>> Recently I had nodes networking between them that were part of a 4 node 
>> cluster: 1 master node, no data and 3 data-only nodes. These were working 
>> fine up to the point that they lost connectivity between themselves. I am 
>> using a unicast cluster setup as multicast was not working on my OpenStack 
>> cluster (a question for another future post).
>>
>> When I rebooted the nodes and tried to bring the master back up it tried 
>> to join the previous master instance. Clearly there is information cached 
>> about the previous running instance that needs to be flushed. As this is a 
>> dev environment, I copied the config file and blew away the elastic search 
>> directory, copied the config back and tried to restart elasticsearch. 
>> Curiously it is still trying to join the old master, even though no other 
>> processes are running. Obviously this cached data lives somewhere else on 
>> the system and I have yet to find it.
>>
>> Perhaps someone can point me in the right direction here.
>>
>>
>
>

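For anyone hitting the same confusion: the random Marvel-character node names (Nightwatch, Plantman) are generated fresh at each startup precisely because nothing in the config pins them down, which makes logs like the above hard to follow. Below is a sketch of the relevant elasticsearch.yml settings for a dedicated unicast master; only cluster.name and 10.0.0.45 come from the thread, while the node name and the other two IPs are illustrative.

```yaml
# Illustrative elasticsearch.yml for a dedicated master-only node on a
# unicast cluster (ES 1.x setting names).
cluster.name: lsflogs
node.name: es-master-1            # fixed name instead of a random hero name
node.master: true
node.data: false
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.0.0.45", "10.0.0.46", "10.0.0.47"]
```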


Re: Problem with date_histogram

2015-01-16 Thread Виктор Булдаков
Thank you!

On Friday, January 16, 2015 at 20:52:32 UTC+3, David Pilato 
wrote:
>
> Thanks for the suggestion.
>
> I just opened an issue here to track this: 
> https://github.com/elasticsearch/elasticsearch/issues/9338
> I don’t know yet why we did not upgrade it.
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
> *
> @dadoonet  | @elasticsearchfr 
>  | @scrutmydocs 
> 
>
>
>  
> On 16 Jan 2015 at 18:26, Виктор Булдаков  > wrote:
>
> Hi!
>
> Could you tell me please when joda-time will be updated to a version newer 
> than 2.5? 
>
> Date aggregation currently works incorrectly because joda-time versions before 
> 2.5 include old tzdata.
>
> Thanks!
>
>
>
>



Re: Identity/Autonumber ID's

2015-01-16 Thread Энхтөр Энхбат
Hi, I tried your module; it is useful and nice. But why does the sequence 
version increment by 100? Is it doing 100 blank updates?

On Sunday, March 30, 2014 at 6:36:59 PM UTC+8, Nicolai Kamenzky wrote:
>
> Hi Daniel, do you use the _version technique now? Is it working well for 
> you?
>
> BTW: I created a node.js module 
>  for it.
>

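For readers landing here: the _version technique referenced above generates IDs by re-indexing one fixed document and using the returned _version as the next sequence number. The block below is a plain-Python simulation of that counter (no Elasticsearch involved); the block-of-100 behaviour is my assumption about the "+100" jumps, i.e. the module may reserve IDs in batches to save round trips.

```python
# Plain-Python simulation of the "_version as sequence" technique:
# re-indexing the same document id bumps _version by one, and that
# version number is used as the generated sequence value.

class FakeIndex:
    """Stands in for an Elasticsearch index; only tracks _version."""
    def __init__(self):
        self.versions = {}

    def index(self, doc_id):
        # Each (blank) reindex of the same id increments its _version.
        self.versions[doc_id] = self.versions.get(doc_id, 0) + 1
        return self.versions[doc_id]

es = FakeIndex()

def next_id():
    return es.index("sequence")

def reserve_block(n=100):
    # Reserving n ids at once (e.g. via one bulk request) makes the
    # stored _version jump by n per reservation -- one possible
    # explanation for the "+100" increments asked about above.
    return [es.index("sequence") for _ in range(n)]

first = next_id()            # 1
block = reserve_block(100)   # ids 2..101; _version advanced by 100
```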


Re: Migrating lucene drill sideways query to elasticsearch

2015-01-16 Thread Bo Finnerup Madsen
Hi Mike,

Thanks, that is in line with what Jörg suggested. I have updated the gist 
with a search using this approach and it gives the correct result. However 
I am a bit concerned about the cost of this, as we will be running 4-5 
facets each of which will require their own set of filters.
But if it is the recommended way, I will try to implement it and run a 
performance test :)

Den fredag den 16. januar 2015 kl. 18.13.34 UTC+1 skrev Michael McCandless:
>
> I think you must do separate filters to compute the sideways facet counts.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Fri, Jan 16, 2015 at 10:15 AM, Bo Finnerup Madsen  > wrote:
>
>> Hi,
>>
>> I am trying to migrate a project from Lucene to elasticsearch, and for 
>> the most part it is a pleasure :)
>> However, I cannot wrap my head around how to recreate the drill sideways 
>> queries we currently use in Lucene.
>>
>> The scenario is a basic search page with a free text search and a bunch 
>> of drill down/sideways facets. In Lucene, the hits that we get for each 
>> facet, is a correct representation of how many results we would get if that 
>> facet is used as a limit, but I am unable to do this in elasticsearch...
>>
>> As an example (full gist available here: 
>> https://gist.github.com/bogundersen/e9bac02779e1c4a089dc)
>>
>> I have three items:
>> Item 1:
>>   language : en_GB,
>>   year: 2013,
>>   author: [ John, Paul ]
>> Item 2:
>>   language : en_GB,
>>   year: 2012,
>>   author: [ John, George ]
>> Item 3:
>>   language : da_DK,
>>   year: 2012,
>>   author: [ Ringo ]
>>
>> Now lets imagine that the user limits to year 2012. If I just include the 
>> facet in the query ("Search 2" in the gist), I would get the following 
>> facets:
>> year
>>   2012 : 2
>> author
>>   George 1
>>   John   1
>>   Ringo  1
>> language
>>   da_DK  1
>>   en_GB  1
>> The author and language facets show the correct numbers, but the year 
>> facet only shows year 2012 thereby not allowing the user to select another 
>> year without deselecting 2012.
>>
>> A way around this is to use post filters ("Search 3" in the gist), using 
>> those I get the following facet results:
>> year
>>   2012 : 2
>>   2013 : 1
>> author
>>   John   2
>>   George 1
>>   Paul   1
>>   Ringo  1
>> language
>>   en_GB  2
>>   da_DK  1
>> Here the user is still presented with other years, but the numbers for 
>> author and language are not correct (e.g. selecting "John" will only give 1 
>> result, and not two)
>>
>> The only way I can think of to make this work, is to do separate queries 
>> for each facet, but that seems counter intuitive and not very performance 
>> friendly. Any ideas on how to do this in elasticsearch?
>>
>> -- 
>> Bo Madsen
>>
>>
>
>

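To make the per-facet filter idea concrete, here is a plain-Python simulation of the counting logic over the three example items (a conceptual sketch of what the filter-wrapped aggregations compute, not an Elasticsearch query): each facet is counted with every active drill-down applied except the one on its own field.

```python
from collections import Counter

items = [
    {"language": "en_GB", "year": 2013, "author": ["John", "Paul"]},
    {"language": "en_GB", "year": 2012, "author": ["John", "George"]},
    {"language": "da_DK", "year": 2012, "author": ["Ringo"]},
]

def matches(item, field, value):
    v = item[field]
    return value in v if isinstance(v, list) else v == value

def sideways_counts(items, selections):
    """selections maps field -> drilled-down value, e.g. {"year": 2012}."""
    facets = {}
    for field in ("year", "author", "language"):
        # Apply every selection EXCEPT the one on this facet's own field.
        active = {f: v for f, v in selections.items() if f != field}
        counts = Counter()
        for item in items:
            if all(matches(item, f, v) for f, v in active.items()):
                v = item[field]
                counts.update(v if isinstance(v, list) else [v])
        facets[field] = dict(counts)
    return facets

result = sideways_counts(items, {"year": 2012})
assert result["year"] == {2013: 1, 2012: 2}      # sideways: other years kept
assert result["author"] == {"John": 1, "George": 1, "Ringo": 1}
assert result["language"] == {"en_GB": 1, "da_DK": 1}
```

In Elasticsearch terms, each facet becomes a filter aggregation whose filter is the conjunction of all selections except its own field, which is why 4-5 facets mean 4-5 filter sets; the filters are cheap and cacheable, but only a performance test on your data will settle the cost question.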


Re: Percolator Memory Usage -- 10-1 Disk-Memory Usage. Why?

2015-01-16 Thread kali.seo
I have experienced the same issue for a few days, with a large data cluster. 
Deactivating percolator queries immediately reduced the garbage collector issue.

I opened an issue on GitHub to learn more about this, because I really need this 
awesome functionality!

I spent a lot of time trying to optimize garbage collection for nothing...



Re: Migrating lucene drill sideways query to elasticsearch

2015-01-16 Thread Bo Finnerup Madsen
Hi Jörg,

That might actually do the trick. I have updated the gist ( 
https://gist.github.com/bogundersen/e9bac02779e1c4a089dc) with a "Search 4" 
which uses this method. It gives the expected results, so that is good :) 
How about the cost of this? We will be doing this for 4-5 facets, and using 
this method they will all be computed using their own set of filters...

Den fredag den 16. januar 2015 kl. 16.46.46 UTC+1 skrev Jörg Prante:
>
> Have you considered to use filters / filter buckets like described in the 
> guide?
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_filtering_queries_and_aggregations.html
>
> Jörg
>
> On Fri, Jan 16, 2015 at 4:15 PM, Bo Finnerup Madsen  > wrote:
>
>> Hi,
>>
>> I am trying to migrate a project from Lucene to elasticsearch, and for 
>> the most part it is a pleasure :)
>> However, I cannot wrap my head around how to recreate the drill sideways 
>> queries we currently use in Lucene.
>>
>> The scenario is a basic search page with a free text search and a bunch 
>> of drill down/sideways facets. In Lucene, the hits that we get for each 
>> facet, is a correct representation of how many results we would get if that 
>> facet is used as a limit, but I am unable to do this in elasticsearch...
>>
>> As an example (full gist available here: 
>> https://gist.github.com/bogundersen/e9bac02779e1c4a089dc)
>>
>> I have three items:
>> Item 1:
>>   language : en_GB,
>>   year: 2013,
>>   author: [ John, Paul ]
>> Item 2:
>>   language : en_GB,
>>   year: 2012,
>>   author: [ John, George ]
>> Item 3:
>>   language : da_DK,
>>   year: 2012,
>>   author: [ Ringo ]
>>
>> Now lets imagine that the user limits to year 2012. If I just include the 
>> facet in the query ("Search 2" in the gist), I would get the following 
>> facets:
>> year
>>   2012 : 2
>> author
>>   George 1
>>   John   1
>>   Ringo  1
>> language
>>   da_DK  1
>>   en_GB  1
>> The author and language facets show the correct numbers, but the year 
>> facet only shows year 2012 thereby not allowing the user to select another 
>> year without deselecting 2012.
>>
>> A way around this is to use post filters ("Search 3" in the gist), using 
>> those I get the following facet results:
>> year
>>   2012 : 2
>>   2013 : 1
>> author
>>   John   2
>>   George 1
>>   Paul   1
>>   Ringo  1
>> language
>>   en_GB  2
>>   da_DK  1
>> Here the user is still presented with other years, but the numbers for 
>> author and language are not correct (e.g. selecting "John" will only give 1 
>> result, and not two)
>>
>> The only way I can think of to make this work, is to do separate queries 
>> for each facet, but that seems counter intuitive and not very performance 
>> friendly. Any ideas on how to do this in elasticsearch?
>>
>> -- 
>> Bo Madsen
>>
>>
>
>



Re: Problem with date_histogram

2015-01-16 Thread David Pilato
Thanks for the suggestion.

I just opened an issue here to track this: 
https://github.com/elasticsearch/elasticsearch/issues/9338 

I don’t know yet why we did not upgrade it.

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet  | @elasticsearchfr 
 | @scrutmydocs 




> On 16 Jan 2015 at 18:26, Виктор Булдаков  wrote:
> 
> Hi!
> 
> Could you tell me please when joda-time will be updated to a version newer than 
> 2.5? 
> 
> Date aggregation currently works incorrectly because joda-time versions before 
> 2.5 include old tzdata.
> 
> Thanks!
> 



Re: Slow Commands with 1.2.4 to 1.4.2 Upgrade

2015-01-16 Thread pskieu
My ES_HEAP_SIZE is already set to 24g per node

On Friday, January 16, 2015 at 4:25:00 AM UTC-8, Arie wrote:
>
> Check your memory usage on ES. In my case I had to specifically put 
> ES_HEAP_SIZE in /etc/init.d/elasticsearch
> to get it working the right way.
>
> On Monday, January 5, 2015 at 7:45:12 PM UTC+1, psk...@gmail.com wrote:
>>
>> It takes upwards of 10 to 30 seconds on average. This is a test instance, 
>> so there's no additional load other than what I'm doing.
>
>



Problem with date_histogram

2015-01-16 Thread Виктор Булдаков
Hi!

Could you tell me please when joda-time will be updated to a version newer 
than 2.5? 

Date aggregation currently works incorrectly because joda-time versions before 
2.5 include old tzdata.

Thanks!



Re: scrolling and lucene segments

2015-01-16 Thread Michael McCandless
The segments are effectively ref counted, so once the last scroll still
using an old (already merged away) segment is deleted, it will be removed.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jan 16, 2015 at 4:15 AM, Jason Wee  wrote:

>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html
>
> Normally, the background merge process optimizes the index by merging
> together smaller segments to create new bigger segments, at which time the
> smaller segments are deleted. This process continues during scrolling, but
> an open search context prevents the old segments from being deleted while
> they are still in use. This is how Elasticsearch is able to return the
> results of the initial search request, regardless of subsequent changes to
> documents.
>
> Tip
> Keeping older segments alive means that more file handles are needed.
> Ensure that you have configured your nodes to have ample free file handles.
> See the section called “File Descriptorsedit”.
>
> Hello,
>
> Read the above description, can anyone tell what happened to the segments
> after the scroll time expired? Does the segments will automatically merge?
> What if a lot (like 50 active) of scroll happened and how will it impact
> the lucene segment/elasticsearch? comments?
>
> Jason
>
>

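The ref-counting Mike describes can be pictured with a tiny sketch (an analogy in plain Python, not Elasticsearch internals): a merged-away segment stays on disk while any open scroll context still references it, and its files are released only when the last reference goes away.

```python
class Segment:
    """Toy stand-in for a Lucene segment with reference counting."""
    def __init__(self, name):
        self.name = name
        self.refs = 0
        self.deleted = False

    def incref(self):
        self.refs += 1

    def decref(self):
        self.refs -= 1
        if self.refs == 0:
            self.deleted = True   # files and file handles released here

seg = Segment("old_segment")
seg.incref()                      # held by the live index reader
for _ in range(3):
    seg.incref()                  # three open scrolls pin it too

seg.decref()                      # a merge replaces it in the live index
assert not seg.deleted            # still pinned by the open scrolls

for _ in range(3):
    seg.decref()                  # scrolls expire or are cleared
assert seg.deleted                # removed only after the last scroll
```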


Re: Migrating lucene drill sideways query to elasticsearch

2015-01-16 Thread Michael McCandless
I think you must do separate filters to compute the sideways facet counts.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jan 16, 2015 at 10:15 AM, Bo Finnerup Madsen  wrote:

> Hi,
>
> I am trying to migrate a project from Lucene to elasticsearch, and for the
> most part it is a pleasure :)
> However, I cannot wrap my head around how to recreate the drill sideways
> queries we currently use in Lucene.
>
> The scenario is a basic search page with a free text search and a bunch of
> drill down/sideways facets. In Lucene, the hits that we get for each facet,
> is a correct representation of how many results we would get if that facet
> is used as a limit, but I am unable to do this in elasticsearch...
>
> As an example (full gist available here:
> https://gist.github.com/bogundersen/e9bac02779e1c4a089dc)
>
> I have three items:
> Item 1:
>   language : en_GB,
>   year: 2013,
>   author: [ John, Paul ]
> Item 2:
>   language : en_GB,
>   year: 2012,
>   author: [ John, George ]
> Item 3:
>   language : da_DK,
>   year: 2012,
>   author: [ Ringo ]
>
> Now lets imagine that the user limits to year 2012. If I just include the
> facet in the query ("Search 2" in the gist), I would get the following
> facets:
> year
>   2012 : 2
> author
>   George 1
>   John   1
>   Ringo  1
> language
>   da_DK  1
>   en_GB  1
> The author and language facets show the correct numbers, but the year
> facet only shows year 2012 thereby not allowing the user to select another
> year without deselecting 2012.
>
> A way around this is to use post filters ("Search 3" in the gist), using
> those I get the following facet results:
> year
>   2012 : 2
>   2013 : 1
> author
>   John   2
>   George 1
>   Paul   1
>   Ringo  1
> language
>   en_GB  2
>   da_DK  1
> Here the user is still presented with other years, but the numbers for
> author and language are not correct (e.g. selecting "John" will only give 1
> result, and not two)
>
> The only way I can think of to make this work, is to do separate queries
> for each facet, but that seems counter intuitive and not very performance
> friendly. Any ideas on how to do this in elasticsearch?
>
> --
> Bo Madsen
>
>



Performance issues when sending documents to multiple indexes at the same time.

2015-01-16 Thread Nawaaz Soogund


We are experiencing some performance issues or anomalies on an Elasticsearch 
cluster, specifically on a system we are currently building.

 

*The requirements:*

We need to capture data for multiple customers, who will query and 
report on it on a near-real-time basis.
the same format with the same properties and are in a flat structure (all 
fields are of primary type and no nested objects). We want to keep each 
customer’s information separate from each other.

 

*Frequency of data received and queried:*

We receive data for each customer at a fluctuating rate of 200 to 700 
documents per second – with the peak being in the middle of the day.

Queries will be mostly aggregations over around 12 million documents per 
customer – histogram/percentiles to show patterns over time and the 
occasional raw document retrieval to find out what happened at a particular 
point in time. We are aiming to serve 50 to 100 customers at varying rates 
of documents inserted – the smallest one could be 20 docs/sec to the 
largest one peaking at 1000 docs/sec for some minutes.

 

*How are we storing the data:*

Each customer has one index per day. For example, if we have 5 customers, 
there will be a total of 35 indexes for the whole week. The reason we break 
it per day is because it is mostly the latest two that get queried with 
occasionally the remaining others. We also do it that way so we can delete 
older indexes independently of customers (some may want to keep 7 days, 
some 14 days’ worth of data)

 

*How we are inserting:*

We are sending data in batches of 10 to 2000 documents, every second. One document 
is around 900 bytes raw.

 

*Environment*

AWS C3-Large – 3 nodes

All indexes are created with 10 shards and 2 replicas for test purposes

Both Elasticsearch 1.3.2 and 1.4.1

 

*What we have noticed:*

If I push data to one index only, response time starts at 80 to 100ms for 
each batch inserted when the rate of insert is around 100 documents per 
second.  As I ramp it up, I can reach 1600 before the insert time gets close 
to 1 sec per batch, and when I increase it to close to 1700 it hits a wall 
at some point because of concurrent insertions and the time spirals to 4 or 
5 seconds. That said, if I reduce the rate of inserts, Elasticsearch 
recovers nicely. CPU usage increases as the rate increases.

 

If I push to 2 indexes concurrently, I can reach a total of 1100 and CPU 
goes up to 93% around 900 documents per second.

If I push to 3 indexes concurrently, I can reach a total of 150 and CPU 
goes up to 95 to 97%. I have tried it many times. The interesting thing is 
that the response time is around 109ms at that point. I can increase the 
load to 900 and response time will still be around 400 to 600ms, but CPU 
stays up.


*Question:*

Looking at our requirements and findings above, is this design suitable for 
what is being asked? Are there any tests I can run to find out more? Are 
there any settings I should check (and change)?

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/76ecd8bb-97cc-4125-8f1a-50de69c2790f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Template is missing

2015-01-16 Thread Gabriele Angeli
Hi guys, I am trying to put a new template into ES 1.3.6 but I always get 
the same result: {"error":"ActionRequestValidationException[Validation Failed: 
1: template is missing;]","status":500}
Does anyone know anything about this error? Thanks in advance
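In ES 1.x this validation error typically appears when the PUT request body is missing or lacks the top-level "template" index pattern. A well-formed request looks roughly like this (the pattern, settings, and mapping are illustrative):

```
PUT /_template/logs_template
{
  "template": "logs-*",
  "settings": { "number_of_shards": 5 },
  "mappings": {
    "event": {
      "properties": {
        "message": { "type": "string" }
      }
    }
  }
}
```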



Re: MySQL JDBC river: indexing large arrays

2015-01-16 Thread Stalinko
Thank you Jörg.

Another option could be parsing some strings into arrays. For example:

SELECT m.*, GROUP_CONCAT(p.name), GROUP_CONCAT(t.value)
FROM movies m
JOIN movies_persons mp ON mp.movie_id = m.id
JOIN persons p ON p.id = mp.person_id
JOIN movies_tags mt ON mt.movie_id = m.id
JOIN tags t ON t.id = mt.tag_id
GROUP BY m.id

where the GROUP_CONCAT fields would be converted into arrays. That way looks
much faster than denormalization.



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/MySQL-JDBC-river-indexing-large-arrays-tp4069168p4069195.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.



Re: Best pratices for index , search and updates

2015-01-16 Thread Phil
Please create your own mappings and don't rely on type detection.
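For example, an explicit mapping can be supplied at index creation time instead of relying on dynamic detection (field names and types are illustrative; "string" is the ES 1.x type):

```
PUT /myindex
{
  "mappings": {
    "product": {
      "properties": {
        "name":  { "type": "string" },
        "price": { "type": "double" },
        "added": { "type": "date" }
      }
    }
  }
}
```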

On Thursday, January 15, 2015 at 6:41:14 PM UTC-5, bvnr wrote:
>
> Am new to the elastic search ...
>
> Can some body throw me ideas about the best practices one should follow to 
> get good performance for index ,search and updates 
>



Re: Migrating lucene drill sideways query to elasticsearch

2015-01-16 Thread joergpra...@gmail.com
Have you considered to use filters / filter buckets like described in the
guide?

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_filtering_queries_and_aggregations.html
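Applied to the year-2012 example from the question, the filter-bucket approach gives each facet its own aggregation filtered by every selection except its own, while post_filter limits only the hits. A sketch (the index name is illustrative; the field names are from the gist):

```
POST /items/_search
{
  "query": { "match_all": {} },
  "post_filter": { "term": { "year": 2012 } },
  "aggs": {
    "year": { "terms": { "field": "year" } },
    "author": {
      "filter": { "term": { "year": 2012 } },
      "aggs": { "author": { "terms": { "field": "author" } } }
    },
    "language": {
      "filter": { "term": { "year": 2012 } },
      "aggs": { "language": { "terms": { "field": "language" } } }
    }
  }
}
```

The unfiltered "year" bucket keeps showing the other years, while "author" and "language" count only documents matching the year selection.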

Jörg

On Fri, Jan 16, 2015 at 4:15 PM, Bo Finnerup Madsen 
wrote:

> Hi,
>
> I am trying to migrate a project from Lucene to elasticsearch, and for the
> most part it is a pleasure :)
> However, I cannot wrap my head around how to recreate the drill sideways
> queries we currently use in Lucene.
>
> The scenario is a basic search page with a free text search and a bunch of
> drill down/sideways facets. In Lucene, the hits that we get for each facet,
> is a correct representation of how many results we would get if that facet
> is used as a limit, but I am unable to do this in elasticsearch...
>
> As an example (full gist available here:
> https://gist.github.com/bogundersen/e9bac02779e1c4a089dc)
>
> I have three items:
> Item 1:
>   language : en_GB,
>   year: 2013,
>   author: [ John, Paul ]
> Item 2:
>   language : en_GB,
>   year: 2012,
>   author: [ John, George ]
> Item 3:
>   language : da_DK,
>   year: 2012,
>   author: [ Ringo ]
>
> Now lets imagine that the user limits to year 2012. If I just include the
> facet in the query ("Search 2" in the gist), I would get the following
> facets:
> year
>   2012 : 2
> author
>   George 1
>   John   1
>   Ringo  1
> language
>   da_DK  1
>   en_GB  1
> The author and language facets show the correct numbers, but the year
> facet only shows year 2012 thereby not allowing the user to select another
> year without deselecting 2012.
>
> A way around this is to use post filters ("Search 3" in the gist), using
> those I get the following facet results:
> year
>   2012 : 2
>   2013 : 1
> author
>   John   2
>   George 1
>   Paul   1
>   Ringo  1
> language
>   en_GB  2
>   da_DK  1
> Here the user is still presented with other years, but the numbers for
> author and language are not correct (e.g. selecting "John" will only give 1
> result, and not two)
>
> The only way I can think of to make this work, is to do separate queries
> for each facet, but that seems counter intuitive and not very performance
> friendly. Any ideas on how to do this in elasticsearch?
>
> --
> Bo Madsen
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/2e97801f-a091-4f1d-8e31-1ffb777f287c%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>



Elasticsearch cluster add node via ssh tunnel

2015-01-16 Thread Marc
Hi,

Earlier I was trying to add a local elasticsearch node to a cluster via an 
ssh-forwarded port.
I mapped one of my cluster machines: localhost:9300 -> remotehost:9300.
Somehow elasticsearch cannot deal with it, using either unicast or multicast.

Any ideas?



Migrating lucene drill sideways query to elasticsearch

2015-01-16 Thread Bo Finnerup Madsen
Hi,

I am trying to migrate a project from Lucene to elasticsearch, and for the 
most part it is a pleasure :)
However, I cannot wrap my head around how to recreate the drill sideways 
queries we currently use in Lucene.

The scenario is a basic search page with a free text search and a bunch of 
drill down/sideways facets. In Lucene, the hit counts we get for each facet 
are a correct representation of how many results we would get if that facet 
were used as a limit, but I am unable to do this in elasticsearch...

As an example (full gist available here: 
https://gist.github.com/bogundersen/e9bac02779e1c4a089dc)

I have three items:
Item 1:
  language : en_GB,
  year: 2013,
  author: [ John, Paul ]
Item 2:
  language : en_GB,
  year: 2012,
  author: [ John, George ]
Item 3:
  language : da_DK,
  year: 2012,
  author: [ Ringo ]

Now let's imagine that the user limits to year 2012. If I just include the 
facet in the query ("Search 2" in the gist), I would get the following 
facets:
year
  2012 : 2
author
  George 1
  John   1
  Ringo  1
language
  da_DK  1
  en_GB  1
The author and language facets show the correct numbers, but the year facet 
only shows year 2012 thereby not allowing the user to select another year 
without deselecting 2012.

A way around this is to use post filters ("Search 3" in the gist), using 
those I get the following facet results:
year
  2012 : 2
  2013 : 1
author
  John   2
  George 1
  Paul   1
  Ringo  1
language
  en_GB  2
  da_DK  1
Here the user is still presented with other years, but the numbers for 
author and language are not correct (e.g. selecting "John" will only give 1 
result, and not two)

The only way I can think of to make this work is to do separate queries 
for each facet, but that seems counterintuitive and not very performance 
friendly. Any ideas on how to do this in elasticsearch?

-- 
Bo Madsen



Re: Unable to write in ElasticSearch using Spark in java (throws java.lang.IncompatibleClassChangeError: Implementing class exception)

2015-01-16 Thread Costin Leau

Hi,

Most likely you have some classes compiled against some old libraries - it 
could even be your jar.
Spark relies on Java serialization, so if your classes or libraries change, 
you need to make sure the updated version is used throughout the entire 
chain.

Oh, and by the way, you seem to be using es-hadoop 2.1.Beta (and its Java 
API for Spark) but your classpath relies on es-hadoop 2.0.
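If the build uses Maven, one way to avoid the mixed classpath is to declare a single es-hadoop artifact explicitly and verify nothing else drags in the 2.0 jar (the version below is an assumption; use whichever 2.1 beta you actually target):

```xml
<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-hadoop</artifactId>
  <version>2.1.0.Beta3</version>
</dependency>
```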

Cheers,

On 1/16/15 10:40 AM, Abhishek Patel wrote:

I am using a simple Java program to index a spark JavaRDD into Elasticsearch. 
My code looks like this -

 SparkConf conf = new 
SparkConf().setAppName("IndexDemo").setMaster("spark://ct-0094:7077");

 conf.set("spark.serializer", 
org.apache.spark.serializer.KryoSerializer.class.getName());
 conf.set("es.index.auto.create", "true");
 conf.set("es.nodes", "192.168.50.103");
 conf.set("es.port", "9200");
 JavaSparkContext sc = new JavaSparkContext(conf);
 
sc.addJar("./target/SparkPOC-0.0.1-SNAPSHOT-jar-with-dependencies.jar");

 String arrayval = "string";
 List<Data> data = Arrays.asList(
 new Data(1l, 10l, arrayval+"1"),
 new Data(2l, 20l, arrayval+"2"),
 new Data(3l, 30l, arrayval+"3"),
 new Data(4l, 40l, arrayval+"4"),
 new Data(5l, 50l, arrayval+"5"),
 new Data(6l, 60l, arrayval+"6"),
 new Data(7l, 70l, arrayval+"7"),
 new Data(8l, 80l, arrayval+"8"),
 new Data(9l, 90l, arrayval+"9"),
 new Data(10l, 100l, arrayval+"10")
 );

 JavaRDD<Data> javaRDD = sc.parallelize(data);
 saveToEs(javaRDD, "index/type");

Running above codes gives an exception (Stack Trace)-


15/01/16 13:20:41 INFO spark.SecurityManager: Changing view acls to: root

15/01/16 13:20:41 INFO spark.SecurityManager: Changing modify acls to: root
15/01/16 13:20:41 INFO spark.SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with
view permissions: Set(root); users with modify permissions: Set(root)
15/01/16 13:20:41 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/01/16 13:20:41 INFO Remoting: Starting remoting
15/01/16 13:20:41 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://sparkDriver@ct-0015:55586]
15/01/16 13:20:41 INFO util.Utils: Successfully started service 'sparkDriver' 
on port 55586.
15/01/16 13:20:41 INFO spark.SparkEnv: Registering MapOutputTracker
15/01/16 13:20:41 INFO spark.SparkEnv: Registering BlockManagerMaster
15/01/16 13:20:41 INFO storage.DiskBlockManager: Created local directory at 
/tmp/spark-local-20150116132041-f924
15/01/16 13:20:41 INFO storage.MemoryStore: MemoryStore started with capacity 
2.3 GB
15/01/16 13:20:41 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using
builtin-java classes where applicable
15/01/16 13:20:41 INFO spark.HttpFileServer: HTTP File server directory is 
/tmp/spark-a65b108f-e131-480a-85b2-ed65650cf991
15/01/16 13:20:42 INFO spark.HttpServer: Starting HTTP Server
15/01/16 13:20:42 INFO server.Server: jetty-8.1.14.v20131031
15/01/16 13:20:42 INFO server.AbstractConnector: Started 
SocketConnector@0.0.0.0:34049
15/01/16 13:20:42 INFO util.Utils: Successfully started service 'HTTP file 
server' on port 34049.
15/01/16 13:20:42 INFO server.Server: jetty-8.1.14.v20131031
15/01/16 13:20:42 INFO server.AbstractConnector: Started 
SelectChannelConnector@0.0.0.0:4040
15/01/16 13:20:42 INFO util.Utils: Successfully started service 'SparkUI' on 
port 4040.
15/01/16 13:20:42 INFO ui.SparkUI: Started SparkUI at http://ct-0015:4040
15/01/16 13:20:42 INFO client.AppClient$ClientActor: Connecting to master 
spark://ct-0094:7077...
15/01/16 13:20:42 INFO cluster.SparkDeploySchedulerBackend: Connected to Spark 
cluster with app ID app-20150116131933-0078
15/01/16 13:20:42 INFO netty.NettyBlockTransferService: Server created on 34762
15/01/16 13:20:42 INFO storage.BlockManagerMaster: Trying to register 
BlockManager
15/01/16 13:20:42 INFO storage.BlockManagerMasterActor: Registering block 
manager ct-0015:34762 with 2.3 GB RAM,
BlockManagerId(, ct-0015, 34762)
15/01/16 13:20:42 INFO storage.BlockManagerMaster: Registered BlockManager
15/01/16 13:20:42 INFO cluster.SparkDeploySchedulerBackend: SchedulerBackend is 
ready for scheduling beginning after
reached minRegisteredResourcesRatio: 0.0
15/01/16 13:20:43 INFO spark.SparkContext: Added JAR 
./target/SparkPOC-0.0.1-SNAPSHOT-jar-with-dependencies.jar at
http://192.168.50.103:34049/jars/SparkPOC-0.0.1-SNAPSHOT-jar-with-dependencies.jar
 with timestamp 1421394643161
Exception in thread "main" java.lang.IncompatibleClassChangeError: Implementing 
class
 at java.lang.ClassLoader.defineClass1(Native Method)
 at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
 at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
 at java.net.URLClassLoader.defineClas

Resetting Node Statistics

2015-01-16 Thread Darren McDaniel
Short of restarting the node, is there any thought of giving users the 
ability to reset the node statistics manually?

For example,  we're running our cluster in an ESX environment, and have 
recently VMotion'd the data to all SSD.  

We'd like to be able to reset all the stats for all the nodes without 
restarting the cluster.

Thanks!



Urgent Requirement || Asp. Net developer@Dallas, TX

2015-01-16 Thread Anurag Singh
Hello,
Hope you are doing well. We are looking for an ASP.NET developer. The job
description is given below. Kindly respond by mailing me or calling the
number given at the end.

Job Description:

Title: Asp. Net developer

Location: Dallas, TX

Duration: 12 Months contract





6-8 years of proven experience with Microsoft .NET technologies including
C#.NET 4.x, ADO.NET, MVC 4.x/5.x, Web API
Strong hands on knowledge in C#, SQL/T-SQL, JavaScript/DHTML, VBScript,
HTML, XML
Minimum 3-5 years in backend software design in SQL Server 2000 or 2005,
Stored procedures
UI skills: HTML, JavaScript, CSS, jQuery.
Experience with one or more newer frameworks like: Angular, Ember, React
Experience developing websites using a Content Management System (CMS)

-- 

Thanks & Regards*,*



Anurag Singh

Phone: 609-897-9670 x 2188

Email: anurag.sysm...@gmail.com

Fax: 609-228-5522

Address: 38 Washington Road, Princeton Jn, NJ 08550

[image: cid:image001.png@01CEC4D8.49178020]



Re: MySQL JDBC river: indexing large arrays

2015-01-16 Thread joergpra...@gmail.com
I will take a look, but this kind of merging is exactly why I wrote the row
processing routine in the JDBC plugin.

The _id of the ES doc cannot be used; the uniqueness must be provided by the
SQL result.

Jörg

On Fri, Jan 16, 2015 at 2:20 PM, Stalinko  wrote:

> Yeah (
>
> The best solution for me here would be loading the data in 3 steps:
>
> 1) SELECT id as _id, * FROM movies
>
> 2) SELECT mp.movie_id as _id, p.name
> FROM persons p
> JOIN movies_persons mp ON mp.person_id = p.id
>
> 3) SELECT mt.movie_id as _id, t.value
> FROM tags t
> JOIN movies_tags mt ON mt.tag_id = t.id
>
> but with updating/merging instead of rewriting. It could be achieved using
> unique "_id" key.
> Isn't that possible with ES? I think it would be an interesting option for
> the next versions of the plugin.
>
>
>
> --
> View this message in context:
> http://elasticsearch-users.115913.n3.nabble.com/MySQL-JDBC-river-indexing-large-arrays-tp4069168p4069184.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1421414457784-4069184.post%40n3.nabble.com
> .
> For more options, visit https://groups.google.com/d/optout.
>



Missing closed indexes after adding node

2015-01-16 Thread Russell Butturini
Hey guys,

New to Elasticsearch but already a huge fan.  I had a strange incident 
happen that I was hoping you could provide some insight on.  

I set up a single Elasticsearch node for a project and collected some data 
in it for a few days to make sure everything was working correctly. No 
issues.  I then closed the indexes from my testing and added a second node 
to the cluster.  When I did that... POOF! All the closed indexes 
disappeared.  No big deal to me, but I can see the disk space is still 
being used by those indexes.  They didn't replicate to the second node (the 
two VMs are identical) because the disk space usage over there is much 
lower.  Is there any way I can either recover the indexes and properly 
purge them, or just remove them from disk by some other method? I don't 
really care about the data, but would like to get the space back.  

Thanks in advance for any help!

-Russell
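As noted elsewhere in the thread, the data in a closed index is still on disk; no actions are taken on closed indexes until they are reopened. Reopening makes them visible again so they can then be purged (the index name is illustrative):

```
POST /old-test-index/_open

DELETE /old-test-index
```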



Re: Shards not replicating to two nodes

2015-01-16 Thread Stewart Gray
Turns out it was a versioning issue - we were running slightly different 
versions of elasticsearch on the two nodes that were not replicating. I 
upgraded these and now it's working wonderfully.
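For reference, a quick way to spot mixed versions across a cluster's nodes is the cat nodes API with a version column:

```
GET /_cat/nodes?h=name,version
```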

Thanks for your help.

Cheers

On Friday, 16 January 2015 12:40:54 UTC, Stewart Gray wrote:
>
> Hi, 
>
> Plenty of disk space on the two machines not receiving shards - there is a 
> symbolic link from /var/lib/elasticsearch to a 400gb drive on each. If I 
> get logstash to log to elasticsearch on one of these, and remove it from 
> the cluster I see the indicies being created in the right place.
>
> The configuration is exactly the same on all machines (apart from the 
> obvious stuff, node name, and other node IP addresses).
>
> Cheers
>
> On Friday, 16 January 2015 12:28:52 UTC, vineeth mohan wrote:
>>
>> Hi ,
>>
>> Can you check the following - 
>>
>>1. Disk space on the machine where shards are not getting used. 
>>2. See the configuration of the node in those machines. See if the 
>>client node only config is enabled in the config.
>>
>> Thanks
>>  Vineeth
>>
>> On Fri, Jan 16, 2015 at 5:20 PM, Stewart Gray  
>> wrote:
>>
>>> I have a new elasticsearch cluster with 4 nodes and for some reason I 
>>> can't get the shards distributed across all of these. Shards are only being 
>>> copied to 2/4 rather than all of them which is what I'm after.
>>>
>>> The config is the same on each:
>>>
>>> cluster.name: elasticsearch
>>> node.name: "x"
>>> gateway.expected_nodes: 4
>>> discovery.zen.minimum_master_nodes: 3
>>> discovery.zen.ping.multicast.enabled: false
>>> discovery.zen.ping.unicast.hosts: ["1.1.1.1", "1.1.1.2", "1.1.1.3"]
>>>
>>> The rest is default. Any thoughts?
>>>
>>>
>>> 
>>>
>>>  -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/9ce6e841-2a03-46be-8e63-3590f752d6bf%40googlegroups.com
>>>  
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>



Re: MySQL JDBC river: indexing large arrays

2015-01-16 Thread Stalinko
Yeah (

The best solution for me here would be loading the data in 3 steps:

1) SELECT id as _id, * FROM movies 

2) SELECT mp.movie_id as _id, p.name 
FROM persons p
JOIN movies_persons mp ON mp.person_id = p.id

3) SELECT mt.movie_id as _id, t.value 
FROM tags t
JOIN movies_tags mt ON mt.tag_id = t.id

but with updating/merging instead of rewriting. It could be achieved using
unique "_id" key.
Isn't that possible with ES? I think it would be an interesting option for
the next versions of the plugin.
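The merge-by-_id behaviour described here can be emulated outside the river with bulk "update" actions and doc_as_upsert, so each pass merges its columns into the existing document (top-level fields only). A stdlib-only sketch of building such a bulk body (the index/type names are illustrative):

```python
import json

def bulk_upsert_body(index, doc_type, rows):
    """rows: iterable of (_id, partial_doc) pairs. Builds an NDJSON _bulk body
    of update actions that merge fields into the doc, creating it if absent."""
    lines = []
    for _id, partial in rows:
        lines.append(json.dumps({"update": {"_index": index, "_type": doc_type, "_id": _id}}))
        lines.append(json.dumps({"doc": partial, "doc_as_upsert": True}))
    return "\n".join(lines) + "\n"

# e.g. step 2 of the plan above: merge person names into existing movie docs
body = bulk_upsert_body("movies", "movie", [(1, {"person": "John"}), (2, {"person": "Ringo"})])
print(body)
```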






Re: MySQL JDBC river: indexing large arrays

2015-01-16 Thread joergpra...@gmail.com
If you don't want to denormalize data, yes.

Jörg

On Fri, Jan 16, 2015 at 1:59 PM, Stalinko  wrote:

> Do you mean creating separate index per each table and then joining them
> somehow when searching?
>
>
>
>
> --
> View this message in context:
> http://elasticsearch-users.115913.n3.nabble.com/MySQL-JDBC-river-indexing-large-arrays-tp4069168p4069182.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1421413172169-4069182.post%40n3.nabble.com
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFkgcK_zwKgX6U9mDHzMfr2JXyYXnvGqWNY2mmN6yHspA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: MySQL JDBC river: indexing large arrays

2015-01-16 Thread Stalinko
Do you mean creating separate index per each table and then joining them
somehow when searching?







Re: Writing custom scripts for indexing data in Elasticsearch

2015-01-16 Thread joergpra...@gmail.com
"schedule" triggers the JDBC plugin by the wall-clock time of the machine,
whereas "interval" simply waits the given time period between two runs.
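For illustration, both parameters live inside the river definition; a sketch with a Quartz-style cron expression (the connection details are placeholders, and the exact cron syntax accepted depends on the plugin version):

```
PUT /_river/my_jdbc_river/_meta
{
  "type": "jdbc",
  "jdbc": {
    "url": "jdbc:mysql://localhost:3306/test",
    "user": "...",
    "password": "...",
    "sql": "select * from orders",
    "schedule": "0 0/15 * ? * *"
  }
}
```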

Jörg

On Fri, Jan 16, 2015 at 11:12 AM, Amtul Nazneen 
wrote:

> Thank you. Is it the "interval" parameter or "schedule" parameter? If I
> set the schedule parameter, then the Elasticsearch will poll the tables
> accordingly right?
>
> On Wednesday, January 14, 2015 at 2:31:07 PM UTC+5:30, David Pilato wrote:
>>
>> I guess you need to set interval. See doc plugin on the home page of the
>> JDBC river.
>>
>> interval - a time value for the delay between two river runs (default:
>> not set)
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>> On 14 Jan 2015 at 06:01, Amtul Nazneen  wrote:
>>
>> Ohkay. So the river runs only once when the script starts? And after that
>> won't it be running in the background to fetch the updates according to a
>> schedule?
>>
>> On Monday, January 12, 2015 at 1:23:08 PM UTC+5:30, Ed Kim wrote:
>>>
>>> It executes once. You could consider running that script on a schedule
>>> and doing incremental updates using timestamps.
>>>
>>> On Sunday, January 11, 2015 at 9:24:28 PM UTC-8, Amtul Nazneen wrote:

 Thank you. I have a doubt though, once I run the script, the river
 plugin is started and the data gets indexed into Elasticsearch, I want to
 know, if the plugin would be running after that, or does it stop once the
 script execution comes to an end?


>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/
>> msgid/elasticsearch/a0997950-4ca4-4036-9550-d1da3816b503%
>> 40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/9db75940-2ca3-4140-a681-cba55ac3725a%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>



How to Limit Search With-In Selected Document ID or Document ID List

2015-01-16 Thread ATL
Use case: we have 10 million documents in a single index on a single 
Elasticsearch server. A user wants to search, but the search results must be 
limited to a given filter of 1 million doc IDs, so ES only returns results 
from within that set.

Currently, when a search is applied, we divide the 1 million doc IDs into 
batches of 10K, run the search once per batch, and loop through all the 
batches. As each result returns, we keep merging, and send one merged 
result to the end user on the website.


Is there any better way to do this search that gets faster results, without 
needing the batched searches, if we want to limit the search results to a 
selected set of 1 million doc IDs?
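For comparison, an ids filter inside a filtered query pushes the restriction into a single request instead of client-side batching; for very large ID sets, a terms filter backed by a terms-lookup document is the usual alternative. A sketch (the index, field, and IDs are illustrative):

```
POST /myindex/_search
{
  "query": {
    "filtered": {
      "query":  { "match": { "body": "search text" } },
      "filter": { "ids": { "values": ["1", "2", "3"] } }
    }
  }
}
```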

Thanks
-V


-- 


*Notice of Confidentiality*

*This email message and its attachments (if any) are intended solely for 
the use of the addressees hereof. In addition, this message and any 
attachments may contain information that is confidential, privileged and 
exempt from disclosure under applicable law.  If you are not the intended 
recipient of this message, you are prohibited from reading, disclosing, 
reproducing, distributing, disseminating or otherwise using this 
transmission. Delivery of this message to any person other than the 
intended recipient is not intended to waive any right or privilege.  If you 
have received this message in error, please promptly notify the sender by 
reply email and immediately delete this message from your system.*



Re: MySQL JDBC river: indexing large arrays

2015-01-16 Thread joergpra...@gmail.com
You can index each table separately, and after a query, you can set up a second
multi-get operation to look up the related documents by their IDs.

Jörg

On Fri, Jan 16, 2015 at 1:51 PM, Stalinko  wrote:

> Nope, that's not the case.
> My SQL was pseudo-code to demonstrate the logic.
>
> In reality it looks more like:
>
> SELECT movies.*, persons.name, tags.value
> FROM movies m
> JOIN movies_persons mp ON mp.movie_id = m.id
> JOIN persons p ON mp.person_id = p.id
> JOIN movies_tags mt ON mt.movie_id = m.id
> JOIN tags t ON mt.tag_id = t.id
>
> And yes, there is a big number of rows anyway, because each movie has many
> persons and tags. There can be 1 very simple tags, but each tag causes the
> whole movie record to be duplicated in the result and everything slows down...
> That's the problem.
>
>
>
> --
> View this message in context:
> http://elasticsearch-users.115913.n3.nabble.com/MySQL-JDBC-river-indexing-large-arrays-tp4069168p4069178.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1421412683551-4069178.post%40n3.nabble.com
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoE8uag9KJwP0TZM7Un4%3D0Vyc6f5fbrye47FzvME%2BkpN9A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
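Besides the multi-get approach, another option — a sketch assuming MySQL, staying with a single river SQL statement — is to collapse the joined rows per movie with `GROUP_CONCAT`, so each movie comes back as one row instead of thousands. The column aliases here are illustrative; how your river maps them into arrays depends on its configuration.

```sql
-- One row per movie; persons and tags arrive as delimited strings
-- that can be split downstream instead of exploding the row count.
SELECT m.id AS _id,
       m.*,
       GROUP_CONCAT(DISTINCT p.name SEPARATOR '|')  AS person_names,
       GROUP_CONCAT(DISTINCT t.value SEPARATOR '|') AS tag_values
FROM movies m
JOIN movies_persons mp ON mp.movie_id = m.id
JOIN persons p         ON p.id = mp.person_id
JOIN movies_tags mt    ON mt.movie_id = m.id
JOIN tags t            ON t.id = mt.tag_id
GROUP BY m.id;
```

Note that MySQL truncates `GROUP_CONCAT` output at `group_concat_max_len` (1024 bytes by default), so with 1000 tags per movie that session variable would need to be raised.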


Re: MySQL JDBC river: indexing large arrays

2015-01-16 Thread Stalinko
Nope, that's not the case.
My SQL was pseudo-code to demonstrate the logic.

In reality it looks more like:

SELECT movies.*, persons.name, tags.value
FROM movies m
JOIN movies_persons mp ON mp.movie_id = m.id
JOIN persons p ON mp.person_id = p.id
JOIN movies_tags mt ON mt.movie_id = m.id
JOIN tags t ON mt.tag_id = t.id

And yes, there is a big number of rows anyway, because each movie has many
persons and tags. There can be 1 very simple tags, but each tag causes the
whole movie record to be duplicated in the result and everything slows down...
That's the problem.



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/MySQL-JDBC-river-indexing-large-arrays-tp4069168p4069178.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1421412683551-4069178.post%40n3.nabble.com.
For more options, visit https://groups.google.com/d/optout.


Re: MySQL JDBC river: indexing large arrays

2015-01-16 Thread joergpra...@gmail.com
You should rethink your SQL statement because JOIN does not work like this,
the result will be an exponential number of rows.

You have to define conditions so that rows of given tables match.

For example

SELECT m.*, persons.name, tags.value
FROM movies m
JOIN persons ON persons.movie_id = m.id
JOIN tags ON tags.movie_id = m.id

Jörg

On Fri, Jan 16, 2015 at 11:34 AM, Stalinko  wrote:

> I'm trying to index my movie DB into ES using MySQL JDBC river.
>
> The problem is:
> there are 3 tables:
> movies - has many columns
> persons - names of the people who participated in some movie
> tags - movie tags
>
> I'm indexing it using such query (it's not exact query, just pseudo-code to
> explain the problem):
>
> SELECT movies.*, persons.name, tags.value
> FROM movies m
> JOIN persons
> JOIN tags
>
> There are quite many movies, each has many columns and each of movies has
> something like 10-30 persons and 1000 tags as well.
> Thus because of the joins all the movies data is duplicated 10.000-30.000
> times in the resulting set.
> That leads to a great overload, one indexation takes more than hour, but I
> need to re-index the data each day.
>
> Is there a way to index arrays without duplicating all the data?
> I tried it in that way - split the query into 3 ones:
>
> SELECT id as _id, * FROM movies
> SELECT movie_id as _id, name FROM persons
> SELECT movie_id as _id, value FROM tags
>
> But these queries overwrite each other instead of updating.
>
> Can anybody help me? Looks like the plugin wasn't designed for such cases
> and I need to write my own strategy (however I don't write in Java :()
>
>
>
>
> --
> View this message in context:
> http://elasticsearch-users.115913.n3.nabble.com/MySQL-JDBC-river-indexing-large-arrays-tp4069168.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1421404458605-4069168.post%40n3.nabble.com
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHdYnaZ1Q3scosmhDJw%2BqKhLhUdKjhWcd8MifzG6_csDA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch JDBC river plugin- Interval vs Schedule.

2015-01-16 Thread joergpra...@gmail.com
"schedule" is synchronized with the wall-clock time of the system.
"interval" is the time that elapses between two runs; the plugin simply waits
for that period.

Jörg

On Fri, Jan 16, 2015 at 11:18 AM, 4m7u1  wrote:

>
> In JDBC river plugin for Elasticsearch, what is the difference between
> "interval" and "schedule"?
> Both are used to keep Elasticsearch in sync with the database.
> Which one is better?
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/8d4069fb-ed3a-4b41-b917-d262366a99c8%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFd_L0ib%3DwhPmWNt7ebnoPbeNa-y6PTwVyPoHTXgYOCJA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
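As a sketch of how the two parameters appear in a river definition (the connection details and cron value are illustrative, and you would normally set only one of the two): "schedule" takes a cron-style expression that fires at wall-clock times — here, every 15 minutes — while replacing that line with e.g. `"interval" : "1h"` would instead wait an hour between runs. Check the JDBC river README for the exact cron syntax your version supports.

```json
{
  "type": "jdbc",
  "jdbc": {
    "url": "jdbc:mysql://localhost:3306/mydb",
    "user": "dbuser",
    "password": "dbpass",
    "sql": "select * from orders",
    "schedule": "0 0/15 * * * ?"
  }
}
```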


Re: Shards not replicating to two nodes

2015-01-16 Thread Stewart Gray
Hi, 

Plenty of disk space on the two machines not receiving shards - there is a 
symbolic link from /var/lib/elasticsearch to a 400 GB drive on each. If I 
get Logstash to log to Elasticsearch on one of these and remove it from 
the cluster, I see the indices being created in the right place.

The configuration is exactly the same on all machines (apart from the 
obvious stuff, node name, and other node IP addresses).

Cheers

On Friday, 16 January 2015 12:28:52 UTC, vineeth mohan wrote:
>
> Hi ,
>
> Can you check the following - 
>
>1. Disk space on the machine where shards are not getting used. 
>2. See the configuration of the node in those machines. See if the 
>client node only config is enabled in the config.
>
> Thanks
>  Vineeth
>
> On Fri, Jan 16, 2015 at 5:20 PM, Stewart Gray  > wrote:
>
>> I have a new elasticsearch cluster with 4 nodes and for some reason I 
>> can't get the shards distributed across all of these. Shards are only being 
>> copied to 2/4 rather than all of them which is what I'm after.
>>
>> The config is the same on each:
>>
>> cluster.name: elasticsearch
>> node.name: "x"
>> gateway.expected_nodes: 4
>> discovery.zen.minimum_master_nodes: 3
>> discovery.zen.ping.multicast.enabled: false
>> discovery.zen.ping.unicast.hosts: ["1.1.1.1", "1.1.1.2", "1.1.1.3"]
>>
>> The rest is default. Any thoughts?
>>
>>
>> 
>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/9ce6e841-2a03-46be-8e63-3590f752d6bf%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6fd7093c-b48f-4a90-918b-5d06b5b2ffb7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch JDBC river plugin metrics

2015-01-16 Thread joergpra...@gmail.com
These are diagnostic messages which crept into one of the releases. The
latest version has metrics logging disabled; it must be enabled via settings.

The metrics count the number of rows fetched from the database, and print
them every minute. This is not the number of documents in ES.

The metrics print an average mean of the row count, so you can see that
your database sent 250 rows per second. They also count the data volume in
bytes and print the measure in megabytes per second, which is an
interesting number for throughput.

Jörg

On Fri, Jan 16, 2015 at 12:16 PM, 4m7u1  wrote:

> I'm trying to run a river query which fetches 1 records, scheduled at a
> 1-minute interval. The first time it runs, the metrics show 1 rows, and after
> a gap of 1 minute (the scheduled interval) the metrics show 2 rows. What
> does this mean? However, the number of hits I get when querying the river
> index is just 1. Why do the metrics on rows keep increasing by a factor of 1
> each time?
>
> *[2015-01-16 16:38:03,406][INFO ][river.jdbc.RiverMetrics  ] pipeline
> org.xbib.elasticsearch.plugin.jdbc.RiverPipeline@20f4b6fe complete: river
> jdbc/my_jdbc_river metrics: 1 rows, 250.5428448620193 mean, (0.0 0.0
> 0.0), ingest metrics: elapsed 3 seconds, 3.37 MB bytes, 352.0 bytes avg, 1
> MB/s*
>
> and also can anyone explain the below values?
>
> metrics: 1 rows, 250.5428448620193 mean, (0.0 0.0 0.0),
> ingest metrics: elapsed 3 seconds, 3.37 MB bytes, 352.0 bytes avg, 1 MB/s.
>
> Thank you.
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/a043ffe1-976d-4c07-bca6-a7ef93f14b3b%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHbGD7YigmdgmZv%2BC4_oe4m32iCg4-eHUKXf4S0urpXcQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


ElasticSearch dynamic_date_formats is not working if there is a mapping in the index (even if empty)

2015-01-16 Thread Ipatios Asmanidis
Using version: 1.4.2
OS: Ubuntu Linux 14.04
JVM: 1.7

Here is how you can reproduce the issue ... you will see that in the first 
case only the myfield properties are applied ... and in the second 
everything looks normal ... 

https://gist.github.com/ypasmk/df441c24a7de0f8ceb69

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/43acfdf5-297c-43bd-91e5-45488216c369%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Shards not replicating to two nodes

2015-01-16 Thread vineeth mohan
Hi ,

Can you check the following -

   1. Disk space on the machine where shards are not getting used.
   2. See the configuration of the node in those machines. See if the
   client node only config is enabled in the config.

Thanks
 Vineeth

On Fri, Jan 16, 2015 at 5:20 PM, Stewart Gray 
wrote:

> I have a new elasticsearch cluster with 4 nodes and for some reason I
> can't get the shards distributed across all of these. Shards are only being
> copied to 2/4 rather than all of them which is what I'm after.
>
> The config is the same on each:
>
> cluster.name: elasticsearch
> node.name: "x"
> gateway.expected_nodes: 4
> discovery.zen.minimum_master_nodes: 3
> discovery.zen.ping.multicast.enabled: false
> discovery.zen.ping.unicast.hosts: ["1.1.1.1", "1.1.1.2", "1.1.1.3"]
>
> The rest is default. Any thoughts?
>
>
> 
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/9ce6e841-2a03-46be-8e63-3590f752d6bf%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGdPd5%3D-ba5OTVj5DNmx5vg7ubv%3Dsw3r96%3DZZR7fQG58fbbizg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Slow Commands with 1.2.4 to 1.4.2 Upgrade

2015-01-16 Thread Arie
Check your memory usage on ES. In my case I had to explicitly set 
ES_HEAP_SIZE in /etc/init.d/elasticsearch
to get it working the right way.

On Monday, January 5, 2015 at 7:45:12 PM UTC+1, psk...@gmail.com wrote:
>
> It takes upwards an average of 10 to 30 seconds. This is a test instance, 
> so there's no additional load other than what I'm doing.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2c48b2b5-f2a1-41bd-b7f0-d4042fc147cc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Shards not replicating to two nodes

2015-01-16 Thread Stewart Gray


I have a new elasticsearch cluster with 4 nodes and for some reason I can't 
get the shards distributed across all of these. Shards are only being 
copied to 2/4 rather than all of them which is what I'm after.

The config is the same on each:

cluster.name: elasticsearch
node.name: "x"
gateway.expected_nodes: 4
discovery.zen.minimum_master_nodes: 3
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["1.1.1.1", "1.1.1.2", "1.1.1.3"]

The rest is default. Any thoughts?



-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9ce6e841-2a03-46be-8e63-3590f752d6bf%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Elasticsearch JDBC river plugin metrics

2015-01-16 Thread 4m7u1
I'm trying to run a river query which fetches 1 records, scheduled at a 
1-minute interval. The first time it runs, the metrics show 1 rows, and after 
a gap of 1 minute (the scheduled interval) the metrics show 2 rows. What 
does this mean? However, the number of hits I get when querying the river 
index is just 1. Why do the metrics on rows keep increasing by a factor of 
1 each time? 

*[2015-01-16 16:38:03,406][INFO ][river.jdbc.RiverMetrics  ] pipeline 
org.xbib.elasticsearch.plugin.jdbc.RiverPipeline@20f4b6fe complete: river 
jdbc/my_jdbc_river metrics: 1 rows, 250.5428448620193 mean, (0.0 0.0 
0.0), ingest metrics: elapsed 3 seconds, 3.37 MB bytes, 352.0 bytes avg, 1 
MB/s*

and also can anyone explain the below values?

metrics: 1 rows, 250.5428448620193 mean, (0.0 0.0 0.0), 
ingest metrics: elapsed 3 seconds, 3.37 MB bytes, 352.0 bytes avg, 1 MB/s.

Thank you.


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a043ffe1-976d-4c07-bca6-a7ef93f14b3b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


MySQL JDBC river: indexing large arrays

2015-01-16 Thread Stalinko
I'm trying to index my movie DB into ES using MySQL JDBC river.

The problem is:
there are 3 tables:
movies - has many columns
persons - names of the people who participated in some movie
tags - movie tags

I'm indexing it using this query (it's not the exact query, just pseudo-code to
explain the problem):

SELECT movies.*, persons.name, tags.value
FROM movies m
JOIN persons
JOIN tags

There are quite a lot of movies, each with many columns, and each movie has
something like 10-30 persons and 1000 tags as well.
Thus, because of the joins, all the movie data is duplicated 10,000-30,000
times in the resulting set.
That leads to a great overload: one indexation takes more than an hour, but I
need to re-index the data each day.

Is there a way to index arrays without duplicating all the data?
I tried it this way - splitting the query into 3:

SELECT id as _id, * FROM movies
SELECT movie_id as _id, name FROM persons
SELECT movie_id as _id, value FROM tags

But these queries overwrite each other instead of updating.

Can anybody help me? It looks like the plugin wasn't designed for such cases
and I need to write my own strategy (however, I don't write Java :()




--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/MySQL-JDBC-river-indexing-large-arrays-tp4069168.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1421404458605-4069168.post%40n3.nabble.com.
For more options, visit https://groups.google.com/d/optout.


Re: How to emit console.log output when using javascript-lang for custom scripts?

2015-01-16 Thread vineeth mohan
Hello Tim,

It's very much possible from Groovy, by accessing the Elasticsearch logger
instance.
You should be able to do something similar for JavaScript too.
Some good folks from Elasticsearch have provided good documentation on it
in response to my issue.

import org.elasticsearch.common.logging.*;

ESLogger logger = ESLoggerFactory.getLogger('myscript');

logger.info('This is a log message');


LINK - https://github.com/elasticsearch/elasticsearch/issues/9068

Thanks
   Vineeth Mohan,
 Elasticsearch consultant,
 qbox.io ( Elasticsearch service provider )

On Wed, Dec 24, 2014 at 12:13 AM, Tim Heckel  wrote:

> Hi all -- I'd like to emit to stdout information within my custom .js
> script (located in /elasticsearch/1.4.2/config/scripts/test.js).
>
> I've tried both console.log and print (just a guess), but neither work:
>
> "error": "ElasticsearchIllegalArgumentException[failed to execute script];
> nested: EcmaError[ReferenceError: \"console\" is not defined.
> (Script2.js#19)]; "
> "error": "ElasticsearchIllegalArgumentException[failed to execute script];
> nested: EcmaError[ReferenceError: \"print\" is not defined.
> (Script1.js#19)]; "
>
> Any way for me to hook into stdout to emit logging information this way?
> Thanks very much.
>
> Tim
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/ea610adb-986c-4de2-a429-6fb03b45a640%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGdPd5mbn3ZRGpCk2Ey3bnJ-JQMXyW_J8ObGHnL8ef3N7CG_7Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Elasticsearch JDBC river plugin- Interval vs Schedule.

2015-01-16 Thread 4m7u1

In JDBC river plugin for Elasticsearch, what is the difference between 
"interval" and "schedule"?
Both are used to keep Elasticsearch in sync with the database. 
Which one is better?

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8d4069fb-ed3a-4b41-b917-d262366a99c8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Elasticsearch: Interval vs. Schedule

2015-01-16 Thread 4m7u1
Hi !

What is the difference between interval and schedule? Both are used to keep 
Elasticsearch in sync with the database. Which one is better?

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2030b825-5f54-4606-8a6d-166bfd5e8e7c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Writing custom scripts for indexing data in Elasticsearch

2015-01-16 Thread Amtul Nazneen
Thank you. Is it the "interval" parameter or the "schedule" parameter? If I set 
the schedule parameter, then Elasticsearch will poll the tables 
accordingly, right?

On Wednesday, January 14, 2015 at 2:31:07 PM UTC+5:30, David Pilato wrote:
>
> I guess you need to set interval. See doc plugin on the home page of the 
> JDBC river.
>
> interval - a time value for the delay between two river runs (default: 
> not set)
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> Le 14 janv. 2015 à 06:01, Amtul Nazneen > 
> a écrit :
>
> Okay. So the river runs only once when the script starts? And after that, 
> won't it be running in the background to fetch the updates according to a 
> schedule?
>
> On Monday, January 12, 2015 at 1:23:08 PM UTC+5:30, Ed Kim wrote:
>>
>> It executes once. You could consider running that script on a schedule 
>> and doing incremental updates using timestamps. 
>>
>> On Sunday, January 11, 2015 at 9:24:28 PM UTC-8, Amtul Nazneen wrote:
>>>
>>> Thank you. I have a doubt though, once I run the script, the river 
>>> plugin is started and the data gets indexed into Elasticsearch, I want to 
>>> know, if the plugin would be running after that, or does it stop once the 
>>> script execution comes to an end?
>>>
>>>
  -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/a0997950-4ca4-4036-9550-d1da3816b503%40googlegroups.com
>  
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9db75940-2ca3-4140-a681-cba55ac3725a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Aggregation - Blank and date aggregation

2015-01-16 Thread buddarapu nagaraju
I was able to figure it out through Fiddler... date histograms are returned in
a separate nested object in the result. It works now.

On Friday, January 16, 2015, Adrien Grand 
wrote:

> This looks good, what error did you get?
>
> On Fri, Jan 16, 2015 at 9:41 AM, buddarapu nagaraju  > wrote:
>
>> Index mapping here
>>
>> "mappings": {
>>   "document": {
>>     "properties": {
>>       "createdDateTime": { "format": "dateOptionalTime", "type": "date" },
>>       "doubleSort1": { "type": "double" },
>>       "stringSort3": { "type": "string" },
>>       "doubleSort2": { "type": "double" },
>>       "doubleSort3": { "type": "double" },
>>       "numSort1": { "type": "long" },
>>       "stringSort2": { "type": "string" },
>>       "dcn": { "type": "string" },
>>       "numSort2": { "type": "long" },
>>       "numSort3": { "type": "long" },
>>       "path": { "type": "string" },
>>       "numField": { "type": "long" },
>>       "dateSort3": { "format": "dateOptionalTime", "type": "date" },
>>       "dateSort2": { "format": "dateOptionalTime", "type": "date" },
>>       "rank": { "type": "double" },
>>       "id": { "type": "long" },
>>       "text": { "type": "string" },
>>       "fields": {
>>         "properties": {
>>           "isAnalyzed": { "type": "boolean" },
>>           "name": { "type": "string" },
>>           "isFullText": { "type": "boolean" },
>>           "isStored": { "type": "boolean" },
>>           "value": { "type": "string" }
>>         }
>>       }
>>     }
>>   }
>> }
>>
>>
>> Regards
>> Nagaraju
>> 908 517 6981
>>
>> On Fri, Jan 16, 2015 at 3:23 AM, buddarapu nagaraju <
>> budda08n...@gmail.com
>> > wrote:
>>
>>> Hi ,
>>>
>>> I tried, but the date histogram didn't work; I'm not sure what mistake I'm
>>> making.
>>>
>>> Here is the date histogram request (JSON) I am passing; I've also pasted a
>>> sample doc structure.
>>>
>>>
>>>
>>>
>>>
>>> date histogram request
>>>
>>> {
>>>   "aggs": {
>>> "createddatetime": {
>>>   "date_histogram": {
>>> "field": "createddatetime",
>>> "interval": "day"
>>>   }
>>> }
>>>   }
>>> }
>>>
>>> Document in index has fields
>>>
>>>
>>>
>>> {
>>>   "id": 79,
>>>   "rank": 0,
>>>   "dateSort2": "2015-01-15T06:08:06.7091884Z",
>>>   "dateSort3": "0001-01-01T00:00:00",
>>>   "doubleSort1": 118.5,
>>>   "doubleSort2": 67884.18,
>>>   "doubleSort3": 54262.6006,
>>>   "numField": 0,
>>>   "createdDateTime": "2015-01-16T06:08:06.7091884Z",
>>>   ...
>>> }
>>>
>>> Regards
>>> Nagaraju
>>> 908 517 6981
>>>
>>> On Thu, Jan 15, 2015 at 12:38 PM, Adrien Grand <
>>> adrien.gr...@elasticsearch.com
>>> > wrote:
>>>
 Then it means that you want to use a date_histogram aggregation with
 interval=day. See
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html

 On Thu, Jan 15, 2015 at 4:43 PM, buddarapu nagaraju <
 budda08n...@gmail.com
 > wrote:

> Hey Adrien ,Thank you.I have one more question on aggregating on dates
> .
>
> We actually stored date time in a field called "createdDateTime" but I
> need only aggregates on date part of date time .
>
> Any ideas ? Or sample code  can help us ?
>
> Regards
> Nagaraju
> 908 517 6981
>
> On Wed, Jan 14, 2015 at 6:10 AM, Adrien Grand <
> adrien.gr...@elasticsearch.com
> >
> wrote:
>
>>
>>
>> On Wed, Jan 14, 2015 at 10:37 AM, buddarapu nagaraju <
>> budda08n...@gmail.com
>> > wrote:
>>
>>> Does term aggregation counts on blank field values ?
>>>
>>>
>> Yes, an empty value "" counts as a term. Note that you need the field
>> to be not analyzed for it to work (or to use an analyzer that emits empty
>> strings). Otherwise the standard analyzer would analyzer "" as an empty
>> list of tokens, so a field value of "" would not actually count...
>>
>>
>>> Does term aggregation is enough for doing date aggregation ? Or
>>> there any specific aggregat

scrolling and lucene segments

2015-01-16 Thread Jason Wee
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html

Normally, the background merge process optimizes the index by merging 
together smaller segments to create new bigger segments, at which time the 
smaller segments are deleted. This process continues during scrolling, but 
an open search context prevents the old segments from being deleted while 
they are still in use. This is how Elasticsearch is able to return the 
results of the initial search request, regardless of subsequent changes to 
documents.

Tip
Keeping older segments alive means that more file handles are needed. 
Ensure that you have configured your nodes to have ample free file handles. 
See the section called "File Descriptors".

Hello,

Having read the description above, can anyone tell what happens to the segments 
after the scroll time expires? Will the segments merge automatically? 
What if a lot of scrolls (like 50 active) happen - how will that impact the 
Lucene segments/Elasticsearch? Comments?

Jason

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2971e92d-397f-4d4b-a6da-0428a6cde638%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Aggregation - Blank and date aggregation

2015-01-16 Thread Adrien Grand
This looks good, what error did you get?

On Fri, Jan 16, 2015 at 9:41 AM, buddarapu nagaraju 
wrote:

> Index mapping here
>
> "mappings": {
>   "document": {
>     "properties": {
>       "createdDateTime": { "format": "dateOptionalTime", "type": "date" },
>       "doubleSort1": { "type": "double" },
>       "stringSort3": { "type": "string" },
>       "doubleSort2": { "type": "double" },
>       "doubleSort3": { "type": "double" },
>       "numSort1": { "type": "long" },
>       "stringSort2": { "type": "string" },
>       "dcn": { "type": "string" },
>       "numSort2": { "type": "long" },
>       "numSort3": { "type": "long" },
>       "path": { "type": "string" },
>       "numField": { "type": "long" },
>       "dateSort3": { "format": "dateOptionalTime", "type": "date" },
>       "dateSort2": { "format": "dateOptionalTime", "type": "date" },
>       "rank": { "type": "double" },
>       "id": { "type": "long" },
>       "text": { "type": "string" },
>       "fields": {
>         "properties": {
>           "isAnalyzed": { "type": "boolean" },
>           "name": { "type": "string" },
>           "isFullText": { "type": "boolean" },
>           "isStored": { "type": "boolean" },
>           "value": { "type": "string" }
>         }
>       }
>     }
>   }
> }
>
>
> Regards
> Nagaraju
> 908 517 6981
>
> On Fri, Jan 16, 2015 at 3:23 AM, buddarapu nagaraju  > wrote:
>
>> Hi ,
>>
>> I tried, but the date histogram didn't work; I am not sure what mistake I
>> am making.
>>
>> Here is the date histogram request (JSON) I am passing; I have also pasted
>> a sample doc structure.
>>
>>
>>
>>
>>
>> date histogram request
>>
>> {
>>   "aggs": {
>> "createddatetime": {
>>   "date_histogram": {
>> "field": "createddatetime",
>> "interval": "day"
>>   }
>> }
>>   }
>> }
>>
>> Document in index has fields
>>
>>
>>
>> {
>>   "id": 79,
>>   "rank": 0,
>>   "dateSort2": "2015-01-15T06:08:06.7091884Z",
>>   "dateSort3": "0001-01-01T00:00:00",
>>   "doubleSort1": 118.5,
>>   "doubleSort2": 67884.18,
>>   "doubleSort3": 54262.6006,
>>   "numField": 0,
>>   "createdDateTime": "2015-01-16T06:08:06.7091884Z",
>>   ...
>> }
>>
>> Regards
>> Nagaraju
>> 908 517 6981
>>
>> On Thu, Jan 15, 2015 at 12:38 PM, Adrien Grand <
>> adrien.gr...@elasticsearch.com> wrote:
>>
>>> Then it means that you want to use a date_histogram aggregation with
>>> interval=day. See
>>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html
>>>
>>> On Thu, Jan 15, 2015 at 4:43 PM, buddarapu nagaraju <
>>> budda08n...@gmail.com> wrote:
>>>
 Hey Adrien, thank you. I have one more question on aggregating on dates.

 We actually store the date-time in a field called "createdDateTime", but I
 need aggregates only on the date part of the date-time.

 Any ideas? Or sample code that can help us?

 Regards
 Nagaraju
 908 517 6981

 On Wed, Jan 14, 2015 at 6:10 AM, Adrien Grand <
 adrien.gr...@elasticsearch.com> wrote:

>
>
> On Wed, Jan 14, 2015 at 10:37 AM, buddarapu nagaraju <
> budda08n...@gmail.com> wrote:
>
>> Does term aggregation count blank field values?
>>
>>
> Yes, an empty value "" counts as a term. Note that you need the field
> to be not analyzed for it to work (or to use an analyzer that emits empty
> strings). Otherwise the standard analyzer would analyze "" as an empty
> list of tokens, so a field value of "" would not actually count...
>
>
>> Is a terms aggregation enough for doing date aggregation, or are there
>> any specific aggregations for this? All I need from date aggregation is
>> to know the distinct dates and their counts.
>>
>
> A terms aggregation is enough, but a date_histogram aggregation is
> generally more useful on dates as there are lots of unique values and it's
> often more useful to group them based on the year, month or day.
>
> --
> Adrien Grand
>

Re: Aggregation - Blank and date aggregation

2015-01-16 Thread buddarapu nagaraju
Index mapping here

"mappings": {
  "document": {
    "properties": {
      "createdDateTime": { "format": "dateOptionalTime", "type": "date" },
      "doubleSort1": { "type": "double" },
      "stringSort3": { "type": "string" },
      "doubleSort2": { "type": "double" },
      "doubleSort3": { "type": "double" },
      "numSort1": { "type": "long" },
      "stringSort2": { "type": "string" },
      "dcn": { "type": "string" },
      "numSort2": { "type": "long" },
      "numSort3": { "type": "long" },
      "path": { "type": "string" },
      "numField": { "type": "long" },
      "dateSort3": { "format": "dateOptionalTime", "type": "date" },
      "dateSort2": { "format": "dateOptionalTime", "type": "date" },
      "rank": { "type": "double" },
      "id": { "type": "long" },
      "text": { "type": "string" },
      "fields": {
        "properties": {
          "isAnalyzed": { "type": "boolean" },
          "name": { "type": "string" },
          "isFullText": { "type": "boolean" },
          "isStored": { "type": "boolean" },
          "value": { "type": "string" }
        }
      }
    }
  }
}


Regards
Nagaraju
908 517 6981

On Fri, Jan 16, 2015 at 3:23 AM, buddarapu nagaraju 
wrote:

> Hi ,
>
> I tried, but the date histogram didn't work; I am not sure what mistake I
> am making.
>
> Here is the date histogram request (JSON) I am passing; I have also pasted
> a sample doc structure.
>
>
>
>
>
> date histogram request
>
> {
>   "aggs": {
> "createddatetime": {
>   "date_histogram": {
> "field": "createddatetime",
> "interval": "day"
>   }
> }
>   }
> }
>
> Document in index has fields
>
>
>
> {
>   "id": 79,
>   "rank": 0,
>   "dateSort2": "2015-01-15T06:08:06.7091884Z",
>   "dateSort3": "0001-01-01T00:00:00",
>   "doubleSort1": 118.5,
>   "doubleSort2": 67884.18,
>   "doubleSort3": 54262.6006,
>   "numField": 0,
>   "createdDateTime": "2015-01-16T06:08:06.7091884Z",
>   ...
> }
>
>
> Regards
> Nagaraju
> 908 517 6981
>
> On Thu, Jan 15, 2015 at 12:38 PM, Adrien Grand <
> adrien.gr...@elasticsearch.com> wrote:
>
>> Then it means that you want to use a date_histogram aggregation with
>> interval=day. See
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html
>>
>> On Thu, Jan 15, 2015 at 4:43 PM, buddarapu nagaraju <
>> budda08n...@gmail.com> wrote:
>>
>>> Hey Adrien, thank you. I have one more question on aggregating on dates.
>>>
>>> We actually store the date-time in a field called "createdDateTime", but I
>>> need aggregates only on the date part of the date-time.
>>>
>>> Any ideas? Or sample code that can help us?
>>>
>>> Regards
>>> Nagaraju
>>> 908 517 6981
>>>
>>> On Wed, Jan 14, 2015 at 6:10 AM, Adrien Grand <
>>> adrien.gr...@elasticsearch.com> wrote:
>>>


 On Wed, Jan 14, 2015 at 10:37 AM, buddarapu nagaraju <
 budda08n...@gmail.com> wrote:

> Does term aggregation count blank field values?
>
>
 Yes, an empty value "" counts as a term. Note that you need the field
 to be not analyzed for it to work (or to use an analyzer that emits empty
 strings). Otherwise the standard analyzer would analyze "" as an empty
 list of tokens, so a field value of "" would not actually count...


> Is a terms aggregation enough for doing date aggregation, or are there
> any specific aggregations for this? All I need from date aggregation is
> to know the distinct dates and their counts.
>

 A terms aggregation is enough, but a date_histogram aggregation is
 generally more useful on dates as there are lots of unique values and it's
 often more useful to group them based on the year, month or day.

 --
 Adrien Grand


Unable to write in ElasticSearch using Spark in java (throws java.lang.IncompatibleClassChangeError: Implementing class exception)

2015-01-16 Thread Abhishek Patel
I am using a simple Java program to index a Spark JavaRDD into
Elasticsearch. My code looks like this:

SparkConf conf = new SparkConf().setAppName("IndexDemo")
        .setMaster("spark://ct-0094:7077");

conf.set("spark.serializer",
        org.apache.spark.serializer.KryoSerializer.class.getName());
conf.set("es.index.auto.create", "true");
conf.set("es.nodes", "192.168.50.103");
conf.set("es.port", "9200");
JavaSparkContext sc = new JavaSparkContext(conf);

sc.addJar("./target/SparkPOC-0.0.1-SNAPSHOT-jar-with-dependencies.jar");

String arrayval = "string";
List<Data> data = Arrays.asList(
        new Data(1L, 10L, arrayval + "1"),
        new Data(2L, 20L, arrayval + "2"),
        new Data(3L, 30L, arrayval + "3"),
        new Data(4L, 40L, arrayval + "4"),
        new Data(5L, 50L, arrayval + "5"),
        new Data(6L, 60L, arrayval + "6"),
        new Data(7L, 70L, arrayval + "7"),
        new Data(8L, 80L, arrayval + "8"),
        new Data(9L, 90L, arrayval + "9"),
        new Data(10L, 100L, arrayval + "10")
);

JavaRDD<Data> javaRDD = sc.parallelize(data);
// saveToEs is statically imported from org.elasticsearch.spark.rdd.api.java.JavaEsSpark
saveToEs(javaRDD, "index/type");

Running the above code throws the following exception (stack trace):

15/01/16 13:20:41 INFO spark.SecurityManager: Changing view acls to: root
15/01/16 13:20:41 INFO spark.SecurityManager: Changing modify acls to: root
15/01/16 13:20:41 INFO spark.SecurityManager: SecurityManager: 
authentication disabled; ui acls disabled; users with view permissions: 
Set(root); users with modify permissions: Set(root)
15/01/16 13:20:41 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/01/16 13:20:41 INFO Remoting: Starting remoting
15/01/16 13:20:41 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://sparkDriver@ct-0015:55586]
15/01/16 13:20:41 INFO util.Utils: Successfully started service 
'sparkDriver' on port 55586.
15/01/16 13:20:41 INFO spark.SparkEnv: Registering MapOutputTracker
15/01/16 13:20:41 INFO spark.SparkEnv: Registering BlockManagerMaster
15/01/16 13:20:41 INFO storage.DiskBlockManager: Created local directory at 
/tmp/spark-local-20150116132041-f924
15/01/16 13:20:41 INFO storage.MemoryStore: MemoryStore started with 
capacity 2.3 GB
15/01/16 13:20:41 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
15/01/16 13:20:41 INFO spark.HttpFileServer: HTTP File server directory is 
/tmp/spark-a65b108f-e131-480a-85b2-ed65650cf991
15/01/16 13:20:42 INFO spark.HttpServer: Starting HTTP Server
15/01/16 13:20:42 INFO server.Server: jetty-8.1.14.v20131031
15/01/16 13:20:42 INFO server.AbstractConnector: Started 
SocketConnector@0.0.0.0:34049
15/01/16 13:20:42 INFO util.Utils: Successfully started service 'HTTP file 
server' on port 34049.
15/01/16 13:20:42 INFO server.Server: jetty-8.1.14.v20131031
15/01/16 13:20:42 INFO server.AbstractConnector: Started 
SelectChannelConnector@0.0.0.0:4040
15/01/16 13:20:42 INFO util.Utils: Successfully started service 'SparkUI' 
on port 4040.
15/01/16 13:20:42 INFO ui.SparkUI: Started SparkUI at http://ct-0015:4040
15/01/16 13:20:42 INFO client.AppClient$ClientActor: Connecting to master 
spark://ct-0094:7077...
15/01/16 13:20:42 INFO cluster.SparkDeploySchedulerBackend: Connected to 
Spark cluster with app ID app-20150116131933-0078
15/01/16 13:20:42 INFO netty.NettyBlockTransferService: Server created on 
34762
15/01/16 13:20:42 INFO storage.BlockManagerMaster: Trying to register 
BlockManager
15/01/16 13:20:42 INFO storage.BlockManagerMasterActor: Registering block 
manager ct-0015:34762 with 2.3 GB RAM, BlockManagerId(, ct-0015, 
34762)
15/01/16 13:20:42 INFO storage.BlockManagerMaster: Registered BlockManager
15/01/16 13:20:42 INFO cluster.SparkDeploySchedulerBackend: 
SchedulerBackend is ready for scheduling beginning after reached 
minRegisteredResourcesRatio: 0.0
15/01/16 13:20:43 INFO spark.SparkContext: Added JAR 
./target/SparkPOC-0.0.1-SNAPSHOT-jar-with-dependencies.jar at 
http://192.168.50.103:34049/jars/SparkPOC-0.0.1-SNAPSHOT-jar-with-dependencies.jar
 
with timestamp 1421394643161
Exception in thread "main" java.lang.IncompatibleClassChangeError: 
Implementing class
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:455)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:367)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader
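For what it's worth, a `java.lang.IncompatibleClassChangeError` at class-load time usually points to binary-incompatible jars on the classpath rather than a bug in the code itself, for example an elasticsearch-hadoop build that does not match the Spark or Hadoop version actually running. A first step is to pin a single es-hadoop artifact and inspect `mvn dependency:tree` for duplicates. A sketch only; the version number below is an illustrative assumption, not taken from this thread:

```xml
<!-- Sketch: pin one es-hadoop artifact matching your Spark/Hadoop versions.
     The version shown is an assumption for illustration. -->
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-hadoop</artifactId>
    <version>2.1.0.Beta3</version>
</dependency>
```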

Re: Aggregation - Blank and date aggregation

2015-01-16 Thread buddarapu nagaraju
Hi ,

I tried, but the date histogram didn't work; I am not sure what mistake I am
making.

Here is the date histogram request (JSON) I am passing; I have also pasted a
sample doc structure.





date histogram request

{
  "aggs": {
"createddatetime": {
  "date_histogram": {
"field": "createddatetime",
"interval": "day"
  }
}
  }
}

Document in index has fields



{
  "id": 79,
  "rank": 0,
  "dateSort2": "2015-01-15T06:08:06.7091884Z",
  "dateSort3": "0001-01-01T00:00:00",
  "doubleSort1": 118.5,
  "doubleSort2": 67884.18,
  "doubleSort3": 54262.6006,
  "numField": 0,
  "createdDateTime": "2015-01-16T06:08:06.7091884Z",
  ...
}


Regards
Nagaraju
908 517 6981

On Thu, Jan 15, 2015 at 12:38 PM, Adrien Grand <
adrien.gr...@elasticsearch.com> wrote:

> Then it means that you want to use a date_histogram aggregation with
> interval=day. See
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html
>
> On Thu, Jan 15, 2015 at 4:43 PM, buddarapu nagaraju  > wrote:
>
>> Hey Adrien, thank you. I have one more question on aggregating on dates.
>>
>> We actually store the date-time in a field called "createdDateTime", but I
>> need aggregates only on the date part of the date-time.
>>
>> Any ideas? Or sample code that can help us?
>>
>> Regards
>> Nagaraju
>> 908 517 6981
>>
>> On Wed, Jan 14, 2015 at 6:10 AM, Adrien Grand <
>> adrien.gr...@elasticsearch.com> wrote:
>>
>>>
>>>
>>> On Wed, Jan 14, 2015 at 10:37 AM, buddarapu nagaraju <
>>> budda08n...@gmail.com> wrote:
>>>
 Does term aggregation count blank field values?


>>> Yes, an empty value "" counts as a term. Note that you need the field to
>>> be not analyzed for it to work (or to use an analyzer that emits empty
>>> strings). Otherwise the standard analyzer would analyze "" as an empty
>>> list of tokens, so a field value of "" would not actually count...
>>>
>>>
 Is a terms aggregation enough for doing date aggregation, or are there
 any specific aggregations for this? All I need from date aggregation is
 to know the distinct dates and their counts.

>>>
>>> A terms aggregation is enough, but a date_histogram aggregation is
>>> generally more useful on dates as there are lots of unique values and it's
>>> often more useful to group them based on the year, month or day.
>>>
>>> --
>>> Adrien Grand
>>>
>
>
>
> --
> Adrien Grand
>
>
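A hedged aside on the "didn't work" report above: field names in Elasticsearch are case-sensitive, and the request aggregates on "createddatetime" while the mapping defines "createdDateTime". Aggregating on a field that does not exist returns empty buckets rather than an error, which would match the symptom. A sketch of the corrected request body:

```python
import json

# Field names are case-sensitive in Elasticsearch. The thread's request uses
# "createddatetime", but the mapping defines "createdDateTime"; aggregating on
# a non-existent field silently yields empty buckets.
corrected = {
    "aggs": {
        "createdDateTime": {
            "date_histogram": {
                "field": "createdDateTime",  # exact case from the mapping
                "interval": "day"
            }
        }
    }
}
print(json.dumps(corrected, indent=2))
```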


Unicode characters and spaces in elasticsearch field names

2015-01-16 Thread George

Hello everybody, 


I've researched a little what characters are allowed in Elasticsearch field
names. However, I couldn't find any official documentation, only some posts
mentioning that '.', '#' and '*' are discouraged. See
http://elasticsearch-users.115913.n3.nabble.com/Illegal-characters-in-elasticsearch-field-names-td4054773.html.

I've indexed some fields containing spaces and unicode characters with
Elasticsearch 1.4.2 ("lucene_version": "4.10.2") and was able to retrieve the
documents with a term query without any problems.

My question is: are there any pitfalls when using unicode characters and
spaces in Elasticsearch field names, or is this discouraged?


Many thanks, 
George
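Not an authoritative answer, but one practical pitfall worth checking: spaces in field names work fine at the JSON level (mappings, term queries), yet they collide with the Lucene query_string syntax, where whitespace separates tokens, so the space must be backslash-escaped there. A small sketch; the field and value names are made up:

```python
import json

# Hypothetical field name containing a space and unicode characters.
FIELD = "größe alt"

# A term query addresses the field name literally, so spaces/unicode are fine:
term_query = {"query": {"term": {FIELD: "42"}}}

# In query_string (Lucene syntax) whitespace is a token separator, so a space
# inside a field name must be escaped with a backslash:
escaped_field = FIELD.replace(" ", "\\ ")
query_string = {"query": {"query_string": {"query": escaped_field + ":42"}}}

print(json.dumps(query_string, ensure_ascii=False))
```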



Re: Using icu_collation plugin in Unit Tests

2015-01-16 Thread joergpra...@gmail.com
You don't need to manually download the jar file if you use Maven. Add the
jar as dependency to your pom.xml


<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-analysis-icu</artifactId>
    <version>2.4.1</version>
</dependency>


Jörg
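With the plugin jar on the embedded node's classpath, the collator is then wired in through index settings. A sketch only: the filter name "myCollator", the analyzer name, and "language": "de" are illustrative assumptions, not taken from this thread:

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "myCollator": { "type": "icu_collation", "language": "de" }
      },
      "analyzer": {
        "germanCollation": {
          "tokenizer": "keyword",
          "filter": ["myCollator"]
        }
      }
    }
  }
}
```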

On Thu, Jan 15, 2015 at 10:47 PM, Kumar S  wrote:

> Thanks David!
>
> Sorry for being new to the ES world. But where would I download the
> JAR file from, and what class should I be using for the icu_collation?
>
> Thank you very much,
> Kumar Subramanian,
>
> On Thursday, January 15, 2015 at 12:52:12 PM UTC-8, David Pilato wrote:
>>
>> You most likely just need to add it as a dependency. Which is easy if you
>> are using maven.
>>
>> David
>>
>> On 15 Jan 2015, at 21:03, Kumar S  wrote:
>>
>> Hi,
>> I am new to ES. I am using NodeBuilder in my unit test to run a local
>> instance of ES. I would like to use the icu_collation plugin. How can I
>> install and run the plugin from within this local instance? Is there an API
>> that I should use? If not, what are the different ways I can do this?
>>
>> Thank you very much,
>> Kumar Subramanian.
>>
>
