Re: How to get more than 10 terms bucket with Java API?
Thx. That did the trick. I now realize that I was setting the size on the return value of addAggregation(), when in fact I needed to set it on the return value of the field() method.

On Tuesday, 29 July 2014 15:56:29 UTC-4, Adrien Grand wrote:
> Hi,
>
> You need to set the size on the terms aggregation:
>
>     AggregationBuilders.terms("category_names").field("category").size(20)
>
> --
> Adrien Grand

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f4ff78a9-f042-43c6-ad61-d11ba5fb7ff1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
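The fix hinges on which builder each chained call returns. The sketch below is only an illustration of that point: `SearchStub` and `TermsStub` are hypothetical stand-ins, not the real Elasticsearch classes. It assumes, as the thread suggests, that addAggregation() hands back the search request builder, so a setSize() chained after it changes the number of hits rather than the number of buckets. The default of 10 buckets is taken from the behaviour reported in the question.

```java
// Hypothetical stand-ins that mimic the shape of the fluent chain in the thread.
class BuilderChainSketch {

    // Stand-in for a terms aggregation builder.
    static class TermsStub {
        final String name;
        String field;
        int size = 10; // default bucket count, as observed in the question

        TermsStub(String name) { this.name = name; }

        TermsStub field(String field) { this.field = field; return this; }

        TermsStub size(int size) { this.size = size; return this; }
    }

    // Stand-in for a search request builder.
    static class SearchStub {
        int hits = 10;
        TermsStub aggregation;

        SearchStub setSize(int hits) { this.hits = hits; return this; }

        // Returns the search builder, not the aggregation builder.
        SearchStub addAggregation(TermsStub agg) { this.aggregation = agg; return this; }
    }

    // The chain from the question: setSize(20) lands on the search builder.
    static SearchStub wrongChain() {
        return new SearchStub()
                .addAggregation(new TermsStub("category_names").field("category"))
                .setSize(20);
    }

    // The chain from the answer: size(20) is called on the aggregation itself.
    static SearchStub rightChain() {
        return new SearchStub()
                .addAggregation(new TermsStub("category_names").field("category").size(20));
    }

    public static void main(String[] args) {
        System.out.println("wrong chain buckets: " + wrongChain().aggregation.size);
        System.out.println("right chain buckets: " + rightChain().aggregation.size);
    }
}
```

In the first chain the aggregation keeps its default bucket count while the hit count changes; in the second the bucket count itself is raised.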
How to get more than 10 terms bucket with Java API?
I have an ES where I have indexed a bunch of files. Each file is tagged with a category field. I want to write Java code to get a list of all the categories. I am trying to do this using the terms aggregation:

    QueryBuilder qb = QueryBuilders.matchAllQuery();
    SearchResponse sr =
        esClient.prepareSearch()
            .setQuery(qb)
            .setSize(20)
            .addAggregation(AggregationBuilders.terms("category_names").field("category")).setSize(20)
            .execute()
            .actionGet();

    Terms terms = sr.getAggregations().get("category_names");
    int numCategories = terms.getBuckets().size();
    System.out.println("Number of category buckets found: "+ numCategories);

But this code always prints out that it found 10 category buckets. Yet, if I execute the following query in Sense:

    GET files-index/file_with_category/_search
    {
      "query": {
        "match_all": {}
      },
      "aggs": {
        "categories": {
          "terms": {
            "field": "category",
            "size": 100
          }
        }
      }
    }

I get 18 category buckets. What am I doing wrong in the Java code?

Thx.

Alain
Retrieving document groups with MoreLikeThis query
Say I have a set of documents D={d1,..., dn} indexed in ES. I also have a set of document groups G={g1,...,gk}, where the groups may not be disjoint.

Given a group g in G, called the Query Group, I want to find "similar" groups g' in G, called the Similar Groups, where "similar" means that the documents in g tend to use the same words as the documents in each of the g'.

What would be the easiest way to do this with ES?

Note that I already know how to find individual documents d which are similar to a group g. You just do a MoreLikeThis search, feeding it a multi-get query with the IDs of the various documents in g. But this will retrieve individual documents, when in fact what I want is document groups.

I can think of several ways of doing this (see below). Not sure what the pros and cons of each approach are, in terms of speed and "accuracy". Does anyone have advice on this?

Thx (PS Sorry for the long post, but it's a pretty complex issue)

Alain

=== Approach 1: Groups as pseudo-documents that concatenate content of their members

Create a type 'documents-group'. These are "pseudo-documents" whose content will be the concatenation of the content of all documents in a given group. You then do a MoreLikeThis (MLT) query, feeding it the pseudo-document for the Query Group, and asking specifically for documents of type documents-group.

PROS:
- Can be supported for sure.

CONS:
- Need to write code in the application to create and manage those pseudo-documents.
- The content of each document will be duplicated N times, where N is the number of groups that it belongs to.
-- This means the index will be much larger (in my application, a document can belong to hundreds of groups).
- Requires a lot of reindexing.
-- Every time you change a document, you need to re-index all the groups it belongs to.
-- If you add/remove a document to a group, you need to re-index that group, which means reindexing the equivalent of all the documents it contains (in my application, a group may contain 100 documents or more).

=== Approach 2: Groups as pseudo-documents containing only the most important terms

This is similar to Approach 1, except that the pseudo-document contains a list of the most significant terms of documents in that group (obtained with the Significant Terms aggregation), as opposed to containing a concatenation of all the member documents. You then do a MoreLikeThis (MLT) query, feeding it the documents-group document for the Query Group, and asking specifically for documents of type documents-group.

PROS:
- Can be supported for sure.
- The content of individual documents is not completely duplicated. At worst, a handful of terms from each document may be duplicated.

CONS:
- Need to write code in the application to create and manage those documents-group pseudo-documents.
- Requires a lot of reindexing.
-- Every time you change a document, you need to re-generate the list of significant terms for all the groups it belongs to.
-- If you add/remove a document to a group, you need to re-generate the list of significant terms for that group.
- May be less accurate, as similarity is based on only the most significant terms, instead of all available terms. In other words, it could be that terms which seem a priori unimportant turn out to be very important when trying to determine similarity to the Query Group.

=== Approach 3: Multi-get MLT, with automatic aggregation of average score on the groups

Each document includes a field that specifies the groups it belongs to. You then do an MLT search, feeding it a Multi-Get query of the IDs of all the documents in the Query Group. The query also asks ES to do a metric aggregation using the group membership field as the basis for the aggregation, and the average MLT score as the metric to be computed. The group that has the highest average MLT score is considered to be the most relevant group.

PROS:
- No need to write application code to create and manage pseudo-documents for the groups.
- Does not require too much reindexing.
-- If you modify a document, you only need to reindex that document.
-- If you add/remove a document in a group, you only need to reindex that one document (because the list of groups that the document belongs to has changed).

CONS:
- Not clear that it can be done. Possible problems are:
-- The groups are overlapping. In other words, a given document may appear in more than one part of the aggregation. Not sure if ES aggregation deals properly with that.
-- Not sure that you can use the MLT score as the metric to be averaged. The reason is that the score is a dynamic property of each document, whose value is determined specifically by MLT at query time. Metric aggregation is typically used with static attributes (ex: a document's author).

=== Approach 4: Multi-get MLT, with "manual" aggregation of average score on the groups

This is very similar to Approach 2, except that the aggregation of MLT scores
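The per-group averaging of MLT scores that Approaches 3 and 4 rely on can be sketched in application code as below. This is a minimal sketch under assumptions, not a tested recipe: the `Hit` class is hypothetical, the scores are made-up inputs that would in practice come from the search hits, and the average is taken only over the hits that were actually returned. Because a document may list several groups, its score is simply counted once for each group it belongs to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupScoreSketch {

    // Hypothetical search hit: a relevance score plus the groups the document belongs to.
    static class Hit {
        final double score;
        final List<String> groups;

        Hit(double score, String... groups) {
            this.score = score;
            this.groups = new ArrayList<>(Arrays.asList(groups));
        }
    }

    // Average the score of the returned hits for each group.
    static Map<String, Double> averageScoreByGroup(List<Hit> hits) {
        Map<String, double[]> sumAndCount = new HashMap<>(); // [0] = sum, [1] = count
        for (Hit hit : hits) {
            for (String group : hit.groups) {
                double[] acc = sumAndCount.computeIfAbsent(group, k -> new double[2]);
                acc[0] += hit.score;
                acc[1] += 1;
            }
        }
        Map<String, Double> averages = new HashMap<>();
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    // Group with the highest average score, or null if there are no groups.
    static String bestGroup(Map<String, Double> averages) {
        String best = null;
        for (Map.Entry<String, Double> e : averages.entrySet()) {
            if (best == null || e.getValue() > averages.get(best)) {
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit(1.0, "g1", "g2"),
                new Hit(0.5, "g2"),
                new Hit(0.25, "g3"));
        Map<String, Double> averages = averageScoreByGroup(hits);
        System.out.println(averages + " best=" + bestGroup(averages));
    }
}
```

One design consequence worth noting: a group with a single high-scoring hit can outrank a group with many moderately scoring hits, so a minimum hit count per group may be worth considering.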
Re: Need help with Java API
On Tuesday, 22 July 2014 14:45:28 UTC-4, Jörg Prante wrote:
> Noo, the bulk request API is fantastic and a must!
>
> Just take a bit of care not to pass nulls or empty requests to it.

Thx for the encouragement. I persevered a bit more and then realized that I had misunderstood how to create an IndexRequest. I fixed the problem and it's now working. I have updated the gist demo code accordingly.

Alain
Re: Need help with Java API
On Tuesday, 22 July 2014 14:21:29 UTC-4, Jörg Prante wrote:
> To Issue 5: unfortunately, there are some hidden NPEs in the bulk request API.

Ah, OK. I guess I'll stay away from the bulk request API then.

Alain
Re: Need help with Java API
Thx Jörg,

Your comments helped me get quite a bit further. Here is an updated version of my code:

https://gist.github.com/alaindesilets/aec9492890c37075fa4e

On Monday, 21 July 2014 13:13:58 UTC-4, Jörg Prante wrote:
> To issue 1: you create a single node cluster without index, and a client of it.

Duh! Don't know how I could have missed that. So now, I added a method createIndex() which creates the index if it doesn't exist, and sets number_of_replicas to 2. It doesn't define mappings since the ES doc says that default mappings will be automatically generated if none are specified.

Note however that even with that change, I still couldn't see the new index in Marvel/Sense when I instantiated the client with method makeClientFromEmbeddedNode(). But if I instantiate the client through a TransportClient, i.e. by invoking method makeClientFromTransportClient(), then Marvel/Sense sees the new index.

Using a TransportClient instead of a client obtained from an embedded node also greatly accelerated the client creation. Instead of taking > 8 secs to create the client, it takes < 1 sec. Not sure why.

> To issue 2: you see the UnavailableShardsException caused by a timeout while indexing to a replica shard. This means you may have set up a single node cluster, but with replica level 1 (default), which needs 2 nodes for indexing. Maybe there was once another node joining the cluster and ES wants it back but it never came (after 60 secs). Then ES returns the timeout error. Maybe replica level 0 helps. You should also check the cluster health. A color of green shows everything works, yellow means there are too few nodes to satisfy the replica condition, and red means the cluster is not in a consistent/usable state.

I tried setting number_of_replicas=0 and number_of_replicas=1 in createIndex(), but I still got the org.elasticsearch.action.UnavailableShardsException error if I instantiate the client with makeClientFromEmbeddedNode(). But I don't get the error if I instantiate the client with makeClientFromTransportClient(), independently of the number of replicas I specify.

> To issue 3: not sure what clusterName() means. I would use settings and add a field "cluster.name". Maybe it is equivalent. You must ensure you use the same "cluster.name" setting throughout all nodes and clients. You also can not reuse data from clusters that had other names (look into the "data" folder).

I have now moved that code to a method called makeClientFromNamedClusterNode(). I haven't been able to make that work at all.

> To issue 4: ES takes ~5 secs for discovery; the zen module pings and waits for responding master nodes by default. If you just test locally on your developer machine, you should disable zen. Most easily by disabling networking at all, by using NodeBuilder.nodeBuilder().local(true)...

Not sure I understand what that means, but in any case, using a TransportClient seems to address that issue.

So, all in all, I feel I am in pretty good shape now. Thanks for the help.

There is one new issue I am now encountering when I try to do bulk indexing, namely:

Issue 5: If I uncomment the call to index100NewBeersInOneBatch(), I get the following:

    == Indexing 100 new beer objects as in one batch... (Elapsed so far: 1.0 seconds)
    Adding beer no 0 to the bulk request. (Elapsed so far: 1.0 seconds)
    Exception in thread "main" java.lang.NullPointerException
        at org.elasticsearch.action.bulk.BulkRequest.internalAdd(BulkRequest.java:129)
        at org.elasticsearch.action.bulk.BulkRequest.add(BulkRequest.java:118)
        at org.elasticsearch.action.bulk.BulkRequestBuilder.add(BulkRequestBuilder.java:52)
        at ca.nrc.ElasticSearch.ElasticSearchDemo.index100NewBeersInOneBatch(ElasticSearchDemo.java:158)
        at ca.nrc.ElasticSearch.ElasticSearchDemo.main(ElasticSearchDemo.java:72)

Any thoughts on what might be going on there?

Thx.

Alain
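Jörg's description of the health colours in the message above can be modelled roughly as follows. This is a simplified sketch, not Elasticsearch code: the `HealthSketch` class and its `health()` method are hypothetical, and the model assumes that a replica is never placed on the node that holds its primary, so an index with R replicas needs at least R + 1 data nodes before every copy can be assigned.

```java
// Simplified, hypothetical model of the green / yellow / red health colours.
class HealthSketch {

    enum Health { GREEN, YELLOW, RED }

    // dataNodes: number of nodes able to hold shards
    // replicas: configured number of replicas for the index
    // allPrimariesActive: whether every primary shard is assigned and started
    static Health health(int dataNodes, int replicas, boolean allPrimariesActive) {
        if (dataNodes < 1 || !allPrimariesActive) {
            return Health.RED;    // some data is not available at all
        }
        if (dataNodes < replicas + 1) {
            return Health.YELLOW; // primaries are fine, some replicas have nowhere to go
        }
        return Health.GREEN;      // every primary and replica can be assigned
    }

    public static void main(String[] args) {
        System.out.println("1 node, 1 replica:  " + health(1, 1, true));
        System.out.println("1 node, 0 replicas: " + health(1, 0, true));
        System.out.println("2 nodes, 1 replica: " + health(2, 1, true));
    }
}
```

Under this model a single development node with the default of one replica never gets past yellow, which is why setting the replica count to 0 is the usual suggestion for local testing.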
Need help with Java API
I am trying to get started with the Java API, using the excellent tutorial found here:

http://www.slideshare.net/dadoonet/hands-on-lab-elasticsearch

But I am still having a lot of trouble. Below is a sample of code that I have written:

    package ca.nrc.ElasticSearch;

    import org.codehaus.jackson.map.ObjectMapper;
    import org.elasticsearch.action.get.GetResponse;
    import org.elasticsearch.action.index.IndexResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.node.NodeBuilder;

    public class ElasticSearchRunner {

        static ObjectMapper mapper;
        static Client client;
        static String indexName = "meal5";
        static String typeName = "beer";
        static long startTimeMSecs;

        public static void main(String[] args) throws Exception {
            startTimeMSecs = System.currentTimeMillis();
            mapper = new ObjectMapper(); // create once, reuse

            echo("Creating the ElasticSearch client...");
            client = NodeBuilder.nodeBuilder().node().client(); // Does this create a brand new cluster?
            // client = NodeBuilder.nodeBuilder().clusterName("handson").client(true).node().client(); // Joins existing cluster called "handson"
            echo("DONE creating the ElasticSearch client... Elapsed time = "+elapsedSecs()+" secs.");

            echo("Creating a beer object...");
            Beer beer = new Beer("Heineken", Colour.PALE, 0.33, 3);
            String jsonString = mapper.writeValueAsString(beer);
            echo("DONE Creating a beer object...");

            echo("Indexing the beer object...");
            IndexResponse ir = null;
            ir = client.prepareIndex(indexName, typeName).setSource(jsonString)
                    .execute().actionGet();
            echo("DONE Indexing the beer object...");

            echo("Retrieving the beer object...");
            GetResponse gr = null;
            gr = client.prepareGet(indexName, typeName, ir.getId()).execute()
                    .actionGet();
            echo("DONE Retrieving the beer object...");
        }

        public static float elapsedSecs() {
            // Integer (long) division: the result is truncated to whole seconds.
            float elapsed = (System.currentTimeMillis() - startTimeMSecs)/1000;
            return elapsed;
        }

        public static void echo(String mess) {
            mess = mess + " (Elapsed so far: "+elapsedSecs()+" seconds)";
            System.out.println(mess);
        }
    }

It works, "sort of"... If I use the first method for creating the client:

    client = NodeBuilder.nodeBuilder().node().client();

Then it works fine the first time I run it. However:

*** ISSUE 1: If I try to inspect the meal index with Marvel, I don't find it.

Also,

*** ISSUE 2: If I run the application a second time, I get the following output:

    Creating the ElasticSearch client... (Elapsed so far: 0.0 seconds)
    log4j:WARN No appenders could be found for logger (org.elasticsearch.node).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
    DONE creating the ElasticSearch client... Elapsed time = 9.0 secs. (Elapsed so far: 9.0 seconds)
    Creating a beer object... (Elapsed so far: 9.0 seconds)
    DONE Creating a beer object... (Elapsed so far: 9.0 seconds)
    Indexing the beer object... (Elapsed so far: 9.0 seconds)
    Exception in thread "main" org.elasticsearch.action.UnavailableShardsException: [meal5][0] [2] shardIt, [0] active : Timeout waiting for [1m], request: index {[meal5][beer][B3F5ZEmSTruqdnlxhYviFg], source[{"brand":"Heineken","colour":"PALE","size":0.33,"price":3.0}]}
        at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.raiseTimeoutFailure(TransportShardReplicationOperationAction.java:526)
        at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$3.onTimeout(TransportShardReplicationOperationAction.java:516)
        at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:239)
        at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:494)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

For me to be able to run the application again, I have to change the indexName variable to a different value (ex: "meal6").

Also:

*** ISSUE 3: If I try using the second method for creating a Client, then it doesn't work. More specifically, if I use this method:

    client = NodeBuilder.nodeBuilder().clusterName("handson").client(true).node().client();

This yields:

    Creating the ElasticSearch client... (Elapsed so far: 0.0 seconds)
    log4j:WARN No appenders could be found for logger (org.elasticsearch.node).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
    DONE creating the ElasticSearch client... Elapsed time = 34.0 secs. (Elapsed so far: 34.0 seconds)
    Creating a beer object... (Elapsed so far: 34.0 seconds)
    DONE Creating a beer object... (Elapsed so far: 34.0 seconds)
    Indexing the beer object... (Elapsed so far: 34.0 seconds)
    Exception in thread "main"
Re: Tutorial on Java interface to ElasticSearch?
Thanks David. This is exactly what I was looking for.

Alain

On Thursday, 3 July 2014 10:10:31 UTC-4, David Pilato wrote:
> Hey Alain,
>
> You can have a look at this:
> https://github.com/elasticsearchfr/hands-on/tree/answers
> http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/index.html
>
> I hope this could help.
>
> --
> David Pilato | Technical Advocate | Elasticsearch.com
> @dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr <https://twitter.com/elasticsearchfr>
Tutorial on Java interface to ElasticSearch?
I am a total newbie to ElasticSearch, and I have been using the following tutorial to get started:

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/getting-started.html

I'm finding it really easy to follow and informative.

Now, I want to use the Java API to write a Java program that uses ElasticSearch, but I can't find an equally helpful tutorial for the Java API. I am surfing and searching through the API documentation, but can't figure out how to do even easy operations like: create an index, put a document into it, retrieve it, etc...

Of course, I could always write methods that send HTTP requests, at which point I could rely on what I learned through the above tutorial. But I would prefer to use a more direct link provided by the Java API.

Does someone know of a good tutorial on the Java API?

Thx.

Alain Désilets
Re: Error trying to use FSRiver
On Tuesday, 1 July 2014 15:13:34 UTC-4, David Pilato wrote:
> Sadly, I can't make any promise here. I think it will be before the end of July though.

I understand. Thanks.

Alain
Re: Error trying to use FSRiver
OK, thx. Any timeline for when the upgrade will be available?

Alain

On Tuesday, 1 July 2014 14:57:54 UTC-4, David Pilato wrote:
> Yeah. Still on my TODO list to upgrade it!
>
> :)
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
Error trying to use FSRiver
I am trying to use the FSRiver with elastic-1.2.1.

I installed the plugin as follows:

    bin\plugin -install fr.pilato.elasticsearch.river/fsriver/1.0.0

I then tried to restart ES by doing:

    bin\service.bat manager
    Stop
    Start

But afterwards, ES still has "Service status: stopped". If I look in the log file, I see the following error:

    [2014-07-01 14:19:51,072][ERROR][bootstrap] {1.2.1}: Initialization Failed ...
    - ExecutionError[java.lang.NoClassDefFoundError: org/elasticsearch/rest/XContentThrowableRestResponse]
        NoClassDefFoundError[org/elasticsearch/rest/XContentThrowableRestResponse]
        ClassNotFoundException[org.elasticsearch.rest.XContentThrowableRestResponse]

I searched for this error and didn't find anything.

Can anyone see what I might be doing wrong here?

Is this a compatibility issue? If I look on the FSRiver page:

http://www.pilato.fr/fsriver/

I see that it only mentions ES 1.0.x and I am using 1.2.1.

Thx.

Alain
How to do MoreLikeTHESE in Elastic Search?
I just LOVE the MoreLikeThis (MLT) feature of ElasticSearch and am finding new and interesting ways of using it all the time.

However these days, I find myself needing a MoreLikeTHESE feature. In other words, given a group of documents that fit together, I want to find other documents that might belong in that group.

Is there a "native" way of doing that with the ES API? For example, is it possible to do MoreLikeThis on an Aggregation? I looked in the API documentation and didn't find anything.

If this is not supported natively, what would be the "best" way to implement this using the existing building blocks? I can think of at least two:

== Approach 1: Concatenated Pseudo-Document ==

Create a pseudo-document whose content is the concatenation of the content of all documents in the group. Add that "document" to the index, then do the regular MoreLikeThis on that document. The disadvantage of this approach is that the pseudo-document could become very large.

== Approach 2: Multiterm Vector Pseudo-Document ==

Retrieve the multiterms of all the documents in the group. Create a pseudo-document that contains only the most frequent or most "important" multiterms. Do the regular MoreLikeThis on this pseudo-document.
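The term-selection step of Approach 2 can be sketched in plain Java as follows. This is only an illustration under assumptions: the `TopTermsSketch` class is hypothetical, "importance" is reduced to raw frequency across the group, and tokens are obtained by lowercasing and splitting on non-word characters, whereas in practice the terms and their statistics would come from the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TopTermsSketch {

    // Return the k most frequent terms across all documents of a group.
    // Ties are broken alphabetically so the result is deterministic.
    static List<String> topTerms(List<String> docs, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : docs) {
            for (String token : doc.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }

        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        List<String> group = Arrays.asList("pale ale beer", "dark beer", "beer and ale");
        // The joined terms would form the content of the pseudo-document.
        System.out.println(String.join(" ", topTerms(group, 2)));
    }
}
```

Raw frequency favours common words, so a stop-word list or a weighting that discounts terms frequent across the whole index would likely be needed before the selected terms are useful for similarity.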