Re: 2.0 ETA

2015-04-20 Thread Matt Weber
Thanks Adrien!

On Mon, Apr 20, 2015 at 3:38 PM, Adrien Grand  wrote:

> Hi Matt,
>
> We have this meta issue which tracks what remains to be done before we
> release 2.0: https://github.com/elastic/elasticsearch/issues/9970. We
> plan to release as soon as we can but some of these issues are quite
> challenging so it's hard to give you an ETA. It should be a matter of
> months but I can't tell how many yet.
>
> However, even if the GA release might still take time, there will be beta
> releases before as we make progress through this checklist.
>
> Sorry if my answer does not give as much information as you hoped, we are
> all looking forward to this release and items on this checklist are very
> high priorities!
>
>
> On Mon, Apr 20, 2015 at 10:55 PM, Matt Weber  wrote:
>
>> Is there an ETA for 2.0?
>>
>> --
>> Thanks,
>> Matt Weber
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CAJ3KEoAa69%3D%2B4NC608ptZzGZmF%2BTiW72yCMikS%2BRKM3RX08KCg%40mail.gmail.com
>> <https://groups.google.com/d/msgid/elasticsearch/CAJ3KEoAa69%3D%2B4NC608ptZzGZmF%2BTiW72yCMikS%2BRKM3RX08KCg%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> Adrien
>



2.0 ETA

2015-04-20 Thread Matt Weber
Is there an ETA for 2.0?

-- 
Thanks,
Matt Weber



Re: Evaluating Moving to Discourse - Feedback Wanted

2015-04-09 Thread Matt Weber
+1 it looks really good.  Would the mailing list mode be enabled so we can
still get everything in our inbox if desired?

Thanks,
Matt Weber

On Thu, Apr 9, 2015 at 11:21 AM, Leslie Hawthorn  wrote:

> Thank you for your feedback, Glen! We're currently planning to use the
> hosted platform.
>
> Folks, please keep the feedback coming. We value your opinion.
>
> Cheers,
> LH
>
> On Thu, Apr 9, 2015 at 5:23 PM, Glen Smith  wrote:
>
>> +1 for recognizing the challenge, proactively approaching it, clearly
>> articulating the issues, and proposing a seemingly beneficial solution.
>>
>> Migrating off of gg would, IMO, be a good thing, for the reasons you
>> stated, plus numerous consequences of the "unsupported" state, e.g. nobody
>> is ever going to invest anything in improving the user interface.
>>
>> I confess to not having heard of discourse before. I like that it's GPL.
>> Is the plan to use a Discourse-hosted instance, or for Elastic to launch
>> its own?
>>
>> In summary, +1.
>>
>> Cheers.
>>
>>
>> On Thursday, April 2, 2015 at 11:36:33 AM UTC-4, leslie.hawthorn wrote:
>>>
>>> Hello everyone,
>>>
>>> As we’ve begun to scale up development on three different open source
>>> projects, we’ve found Google Groups to be a difficult solution for dealing
>>> with all of our needs for community support. We’ve got multiple mailing
>>> lists going, which can be confusing for new folks trying to figure out
>>> where to go to ask a question.
>>>
>>> We’ve also found our lists are becoming noisy in the “good problem to
>>> have” kind of way. As we’ve seen more user adoption, and across such a wide
>>> variety of use cases, we’re getting widely different types of questions
>>> asked. For example, I can imagine that folks not using our Python client
>>> would rather not be distracted with emails about it.
>>>
>>> There are also a few other strikes against Groups as a tool, such as the
>>> fact that it is no longer a supported product by Google, it provides no API
>>> hooks and it is not available for users in China.
>>>
>>> We’ve evaluated several options and we’re currently considering
>>> shuttering the elasticsearch-user and logstash-users Google Groups in favor
>>> of a Discourse forum. You can read more about Discourse at
>>> http://www.discourse.org
>>>
>>> We feel Discourse will allow us to provide a better experience for all
>>> of our users for a few reasons:
>>>
>>> * More fine grained conversation topics = less noise and better targeted
>>> discussions. e.g. we can offer a forum for each language client, individual
>>> logstash plugin or for each city to plan user group meetings, etc.
>>>
>>> * Facilitates discussions that are not generally happening on list now,
>>> such as best practices by use case or tips for moving from development to
>>> production
>>>
>>> * Easier for folks who are purely end users - and less used to getting
>>> peer support on a mailing list - to get help when they need it
>>>
>>> Obviously, Discourse does not function the exact same way as a mailing
>>> list - however, email interaction with Discourse is supported and will
>>> continue to allow you to participate in discussions over email (though
>>> there are some small issues related to in-line replies. [0])
>>>
>>> We’re working with the Discourse team now as part of evaluating this
>>> transition, and we know they’re working to resolve this particular issue.
>>> We’re also still determining how Discourse will handle our needs for both
>>> user and list archive migration, and we’ll know the precise details of how
>>> that would work soon. (We’ll share when we have them.)
>>>
>>> The final goal would be to move Google Groups to read-only archives, and
>>> cut over to Discourse completely for community support discussions.
>>>
>>> We’re looking at making the cut over in ~30 days from today, but
>>> obviously that’s subject to the feedback we receive from all of you. We’re
>>> sharing this information to set expectations about time frame for making
>>> the switch. It’s not set in stone. Our highest priority is to ensure
>>> effective migration of our list archives and subscribers, which may mean a
>>> long

Re: Accuracy issue of aggregation results

2014-09-16 Thread Matt Weber
Hi Yifan,

Nothing dynamic, but you can increase the number of terms collected on each
shard to increase the accuracy [1].  Might also want to play with the
shard_min_doc_count value if you know certain shards have a low hit count
and are throwing off the aggregations [2].

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_shard_size
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_minimum_document_count
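
A minimal sketch of the request body this suggests, written as a Python dict. The field name "tag", the aggregation name "top_terms" and the numeric values are placeholders and not values from this thread; shard_size and shard_min_doc_count are the terms aggregation options described in [1] and [2].

```python
import json


def terms_agg_body(field, size=10, shard_size=100, shard_min_doc_count=None):
    """Build a search body containing a single terms aggregation.

    shard_size asks every shard for more candidate buckets than the final
    `size`, so a term that is frequent overall but rare on one shard is
    less likely to be dropped before the results are merged.
    """
    terms = {"field": field, "size": size, "shard_size": shard_size}
    if shard_min_doc_count is not None:
        # Only set when requested; otherwise the server default applies.
        terms["shard_min_doc_count"] = shard_min_doc_count
    # "top_terms" is a hypothetical aggregation name.
    return {"size": 0, "aggs": {"top_terms": {"terms": terms}}}


body = terms_agg_body("tag", size=10, shard_size=200, shard_min_doc_count=2)
print(json.dumps(body, indent=2))
```

A larger shard_size costs more memory and network traffic per request, so it is usually raised step by step until the top N stops changing.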

Thanks,
Matt Weber


On Tue, Sep 16, 2014 at 12:36 PM, Yifan Wang 
wrote:

> It seems to be a common problem that the top N results returned from an
> aggregation query are inaccurate due to uneven distribution of matching
> documents across shards, because ES collects the top N buckets from
> each shard no matter how many hits are actually on each shard. Very
> often we collect buckets on some shards that should not have been
> collected, and miss buckets on others that should have been collected.
>
> Is there a way we can collect buckets based on a dynamic "weight", for
> example "total hits", on that shard?
>
> Thanks in advance.
>



Re: reduce-style aggregators

2014-08-09 Thread Matt Weber
You will be able to do this soon.  See:

https://github.com/elasticsearch/elasticsearch/pull/7075

Thanks,
Matt Weber
On Aug 9, 2014 10:44 AM, "James Cook"  wrote:

> There seems to be some reluctance by the ES team to provide scriptable
> aggregators, or perhaps it's on a roadmap and just taking a long time.
> Kimchi has stated that he would like to identify these use cases and roll
> them into built-in aggregations so everyone can benefit. I think the range
> of these use cases is too broad for a specific set of implementations. The
> aggregation types included are fine for simple stats and bucketing, but
> there are probably hundreds of scenarios that require custom aggregation.
>
> I'll probably look into creating a custom aggregator for my use case.
>
> Note: a lot of the custom aggregators mentioned in this forum are general
> implementations of the reduce clause in a map reduce statement.
>
> My use case is the implementation of an Item Response Theory algorithm on
> a filtered result set. The idea is that a list of student responses to a
> question can be incrementally processed to result in a proficiency value.
> The filter (map) results are restricted to a timeframe and sorted, and the
> aggregation (reduce) step will incrementally inject each student's score
> into an algorithm that progressively converges on the student's ability
> magnitude to answer those questions.
>



Re: n:m lookup filter

2014-07-25 Thread Matt Weber
It's currently blocked until we can figure out a way to prevent a bad query
from triggering an OOM error.  The goal (as far as I've been told) is to
get this in, but no ETA.   I need to update the PR to the latest master as
there have been significant changes as well.

Thanks,
Matt Weber
On Jul 25, 2014 8:52 PM, "Don Clore"  wrote:

> Does anyone know the status of that pull request?   Is it likely to be
> approved?
>
> thanks,
> Don
>
> On Saturday, July 19, 2014 12:14:01 AM UTC-7, Jörg Prante wrote:
>>
>> Yes, I think this is somehow related to Matt's Join Filter
>>
>> https://github.com/elasticsearch/elasticsearch/pull/3278
>>
>> Jörg
>>
>>
>> On Sat, Jul 19, 2014 at 4:24 AM, Don Clore  wrote:
>>
>>> I am pretty sure this is not supported, but it'd be great to get explicit
>>> confirmation/denial.
>>>
>>> So: document types A and B, where there's an N:M relationship between
>>> A and B, and document type B has a list of the document A instances that
>>> relate to it.
>>>
>>> More concretely, A == a sports Player data type, and B is a set of news
>>> stories.   The Story type has a list of the ids of Players that the story
>>> is about/related to.
>>>
>>> So I know the terms lookup filter allows one to use a single document
>>> as the source of the terms for the lookup.   What we'd like to be able to
>>> do is expose a faceted/aggregations-based UI to the user that allows her to
>>> perform a variety of filtering operations on Players over a fairly
>>> extensive set of criteria, and then have the resulting set of Player
>>> document ids serve as the lookup into the Story stories, i.e., get all the
>>> stories that relate to the Player result set.
>>>
>>> Obviously, we'd ideally like to do this in a single query, or failing
>>> that, have some reasonably efficient way to issue the two query/filters
>>> (passing a large result set of ids over the wire seems like a bad idea; I'm
>>> new to ES, but...this kind of thing was never great with Solr).
>>>
>>> One idea I had (perhaps half-baked) was to create a PlayerResultSet
>>> type, with an id deterministically fashioned from the query/filter
>>> predicates such that the same user filtering action would result in the
>>> same PlayerResultSet id each time; we'd issue a terms lookup filter request
>>> using the PlayerResultSet id, if it fails because the PlayerResultSet
>>> document doesn't exist, then we'd have to issue the filter for the Players,
>>> construct a PlayerResultSet doc and index it, and query for the Stories
>>> that have those Player Ids; not sure if it would be worse to issue all the
>>> ids in a query, or index the PlayerResultSet doc with Refresh==true (or
>>> issue the query and queue up the PlayerResultSet doc for later indexing, or
>>> whatever).
>>>
>>> The Player data should be fairly static; we could delete the documents
>>> and recreate them each time we refresh Player data.
>>>
>>> Ok, that sounds pretty awful, I'm hoping someone has a less
>>> Rube-Goldberg approach; obviously, I'm sort of building in my filter query
>>> caching mechanism, hopefully something like this can be more easily
>>> achieved with the built-in filter caching.
>>>
>>> thanks for any insights,
>>> Don
>>>



Re: Elasticsearch 1.3 Transform Scripts

2014-07-24 Thread Matt Weber
Yea, I really like it.  I have been thinking about the exact same thing for
a while but never had the time to put it together.  I do have some things I
would like to add such as the ability to stop a document from being indexed
when the doc has/does not have a specific value.  At any rate, great job!

Thanks,
Matt Weber


On Thu, Jul 24, 2014 at 7:01 PM, Nikolas Everett  wrote:

> I wanted to do conditional copy_to and Andrian suggested implementing
> scripted transforms instead. Much more flexible. They mesh well with the
> shift to groovy too because groovy is much more stable. Stable enough to
> run on every insert.
>
> I'm glad you are excited by it. It was fun to build and I hope lots of
> people enjoy it.
>
> Nik
> On Jul 24, 2014 9:22 PM, "Matt Weber"  wrote:
>
>> Just wanted to bring attention to the new and *very* useful transform
>> scripts that were introduced in elasticsearch 1.3 [1].  This feature allows
>> you to manipulate the source BEFORE it is indexed so you can do things like
>> add/remove fields, change field values, etc.  Groovy scripts will be the
>> default, but you can write native transform scripts as well [2].
>>
>> [1]
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-transform.html#mapping-transform
>> [2] https://github.com/imotov/elasticsearch-native-script-example/pull/7
>>
>> Thanks,
>> Matt Weber
>>



Elasticsearch 1.3 Transform Scripts

2014-07-24 Thread Matt Weber
Just wanted to bring attention to the new and *very* useful transform 
scripts that were introduced in elasticsearch 1.3 [1].  This feature allows 
you to manipulate the source BEFORE it is indexed so you can do things like 
add/remove fields, change field values, etc.  Groovy scripts will be the 
default, but you can write native transform scripts as well [2].

[1] 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-transform.html#mapping-transform
[2] https://github.com/imotov/elasticsearch-native-script-example/pull/7
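
A rough sketch of what such a mapping might look like, built as a Python dict. The type name "article", the field names and the Groovy one-liner are made up for illustration; only the "transform" block with "script" and "lang" follows the mapping-transform page in [1].

```python
import json


def transform_mapping(type_name):
    """Return a mapping whose transform script adds a derived field.

    The script runs before the document is indexed; here it copies the
    length of a hypothetical `title` field into `title_length`.
    """
    script = (
        "if (ctx._source['title'] != null) { "
        "ctx._source['title_length'] = ctx._source['title'].length() }"
    )
    return {
        type_name: {
            "transform": {"lang": "groovy", "script": script},
            "properties": {
                "title": {"type": "string"},
                "title_length": {"type": "integer"},
            },
        }
    }


mapping = transform_mapping("article")  # "article" is a placeholder type name
print(json.dumps(mapping, indent=2))
```

The dict can be sent as the body of a put-mapping request with whichever client you already use.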

Thanks,
Matt Weber



Re: Cross-index parent/child relationship

2014-06-26 Thread Matt Weber
I have not tested routing but I did put that functionality in so it should
work fine.  Let me know if you have any issues!

Thanks,
Matt Weber



On Thu, Jun 26, 2014 at 7:20 PM, Drew Kutcharian  wrote:

> Thanks Matt, that feature is exactly what we need. One thing I couldn’t
> figure out: would I be able to pass a routing key so that only the relevant
> shards are queried?
>
>
>
> On Jun 26, 2014, at 8:14 AM, Matt Weber  wrote:
>
> See PR #3278.  Hopefully it will get merged into one of the next releases.
>
> https://github.com/elasticsearch/elasticsearch/pull/3278
>
> Thanks,
> Matt Weber
>
>
>
> On Thu, Jun 26, 2014 at 12:10 AM, Thomas  wrote:
>
>> Hi,
>>
>> Unfortunately this is not supported by elasticsearch; the parent document
>> and the child document must be in the same index or else the routing
>> will not be established. You can either try copying the parent documents if
>> there are not many of them, or split your data another way, for example with
>> a hash function, to ensure that both parent and child documents are
>> indexed into the same index.
>>
>> Hope it helps
>> Thomas
>>
>> On Wednesday, 25 June 2014 04:48:48 UTC+3, Drew wrote:
>>>
>>> Hi!
>>>
>>> Does ES support cross-index parent/child relationships? More
>>> specifically, can I have all the parents in one index (say users) and the
>>> children (say events) in multiple time-series-style indices (managed by
>>> curator)? If so, how is this done? If not, what’s the alternative?
>>>
>>> Thanks,
>>>
>>> Drew
>>
>>



Re: Cross-index parent/child relationship

2014-06-26 Thread Matt Weber
See PR #3278.  Hopefully it will get merged into one of the next releases.

https://github.com/elasticsearch/elasticsearch/pull/3278

Thanks,
Matt Weber



On Thu, Jun 26, 2014 at 12:10 AM, Thomas  wrote:

> Hi,
>
> Unfortunately this is not supported by elasticsearch; the parent document
> and the child document must be in the same index or else the routing
> will not be established. You can either try copying the parent documents if
> there are not many of them, or split your data another way, for example with
> a hash function, to ensure that both parent and child documents are
> indexed into the same index.
>
> Hope it helps
> Thomas
>
> On Wednesday, 25 June 2014 04:48:48 UTC+3, Drew wrote:
>>
>> Hi!
>>
>> Does ES support cross-index parent/child relationships? More specifically,
>> can I have all the parents in one index (say users) and the children (say
>> events) in multiple time-series-style indices (managed by curator)? If
>> so, how is this done? If not, what’s the alternative?
>>
>> Thanks,
>>
>> Drew
>



Re: Trigram-accelerated regex searches

2014-05-22 Thread Matt Weber
Leading wildcards are really expensive.  Maybe you can try creating a copy
of your "content" field that reverses the tokens using reverse token filter
[1].  By doing this you turn those expensive leading wildcards into
trailing wildcards which should give you better performance.  I think your
query would look something like this:

{
  "query": {
    "constant_score": {
      "query": {
        "bool": {
          "should": [
            {"wildcard": {"content": "Children*Next*"}},
            {"wildcard": {"content_rev": "txeN*nerdlihC*"}}
          ]
        }
      }
    }
  }
}

Note that you will need to reverse your query string as the wildcard query
is not analyzed.
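
A small sketch of the client-side part, in Python. The helper names and the "content_reversed" analyzer are hypothetical; the field name "content_rev" and the "reverse" token filter come from the message above. The tokenizer choice is an assumption and should match however "content" is analyzed today.

```python
def reverse_pattern(pattern):
    """Reverse a wildcard pattern character by character.

    Assumes the pattern contains no backslash escapes; an escaped
    wildcard would be corrupted by a plain reversal.
    """
    return pattern[::-1]


def reversed_wildcard_query(pattern, rev_field="content_rev"):
    """Build a constant_score wildcard query against the reversed field."""
    return {
        "query": {
            "constant_score": {
                "query": {"wildcard": {rev_field: reverse_pattern(pattern)}}
            }
        }
    }


# Hypothetical index settings: an analyzer that reverses every token, to be
# assigned to the "content_rev" field in the mapping.
REVERSED_ANALYSIS = {
    "analysis": {
        "analyzer": {
            "content_reversed": {
                "type": "custom",
                "tokenizer": "standard",  # assumption: mirror the "content" field
                "filter": ["reverse"],
            }
        }
    }
}

query = reversed_wildcard_query("*Children*Next")
print(query)
```

The leading wildcard in "*Children*Next" becomes a trailing one after reversal, which is the case the term dictionary can handle with a prefix seek instead of a full scan.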

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-reverse-tokenfilter.html#analysis-reverse-tokenfilter

Thanks,
Matt Weber


On Thu, May 22, 2014 at 11:09 AM, Erik Rose  wrote:

> Martijn took a swing at it just now. He eliminated any scoring-based
> slowdown, like so (constant_score_filter)…
>
> curl -s -XGET 'http://127.0.0.1:9200/dxr_test/line/_search?pretty' -d
> '{
>   "query": {
>     "filtered": {
>       "query": {
>         "match_all": {}
>       },
>       "filter": {
>         "and": [
>           {
>             "query": {
>               "match_phrase": {
>                 "content_trg": "Children"
>               }
>             }
>           },
>           {
>             "query": {
>               "match_phrase": {
>                 "content_trg": "Next"
>               }
>             }
>           },
>           {
>             "query": {
>               "wildcard": {
>                 "content": {
>                   "wildcard": "*Children*Next*",
>                   "rewrite": "constant_score_filter"
>                 }
>               }
>             }
>           }
>         ]
>       }
>     }
>   }
> }'
>
> …but it didn't make any difference. Somehow, the `and` pipeline isn't
> behaving as we expect. Since ES can't provide any more detailed timing
> output, I guess the next step is to go look at the source code for the `and`
> filter and the wildcard query and see what's what.
>
> I think we'd both be fascinated to know what's going on, if anyone has
> anything to add.
>



-- 
Thanks,
Matt Weber



Re: ElasticSearch vs Solr integration with Tomcat6

2014-05-09 Thread Matt Weber
It would be best to manage elasticsearch outside of tomcat and use the java
or rest api to communicate with ES from within your app.  If you absolutely
must run ES within tomcat, have a look at the wares transport[1].

[1] https://github.com/elasticsearch/elasticsearch-transport-wares

Thanks,
Matt Weber


On Fri, May 9, 2014 at 8:02 AM, Hariharan Vadivelu wrote:

> Unlike Solr, elasticsearch does not require a Java container; however, you
> can always instantiate ES in embedded mode within your J2EE application.
> More details are available here.
>
> http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/client.html
>
>
> On Friday, May 9, 2014 7:43:56 AM UTC-5, anass benjelloun wrote:
>>
>> Hello,
>>
>> I need to compare the two solutions, ElasticSearch and Solr, and then choose
>> one of them to integrate into my webapp. I'm using a tomcat6 server and I
>> installed/configured Solr.war in my webapp without any problem, but when I
>> tried to integrate ElasticSearch with tomcat I didn't find enough
>> information.
>> If someone knows the steps to follow, or any other solution to my
>> problem, please describe it step by step.
>> Thanks,
>>
>> regards,
>> Anass BENJELLOUN
>>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAJ3KEoCthUummqv16MdgNiA%2BR5YzxUhhDRrzXTW89N8H6Jjx4A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Query and Filter

2014-04-18 Thread Matt Weber
Did you reindex your docs after updating the mapping?  Can you post your
mapping and original docs?

On Friday, April 18, 2014, Matt Hughes  wrote:

> Thanks for the quick reply!
>
> I updated the mappings and confirmed both types read not_analyzed.   I
> also updated the query to use bool/must:
>
> {
>"from":0,
>"size":200,
>"query":{
>   "filtered":{
>  "query":{
> "query_string":{
>"fields":[
>   "_all"
>],
>"query":"\"Test message from AT by user admin was
> generated\""
> }
>  },
>  "filter":{
> "bool":{
>"must":[
>   {
>  "term":{
> "where.appId":
> "12229ac6-8e9a-43ff-ab67-e80f3c585a69"
>  }
>   },
>   {
>  "term":{
> "where.processId":
> "bd13dbe5-0a4c-4469-a645-44cb3fde280a"
>  }
>   }
>]
> }
>  }
>   }
>}
> }
>
> Still not getting any hits though.  Tried escaping the terms.  Is there
> anything special about having nested field names like that
> 'where.processId'?
>
> On Friday, April 18, 2014 4:07:31 PM UTC-4, Matt Weber wrote:
>>
>> Chances are your appId and processId fields are analyzed so it is
>> breaking up the id's.  Update your mapping of these fields so it is not
>> analyzed [1].  Also, you should not use an "and" filter to combine term
>> filters.  Use a boolean filter [2] with must clauses for better
>> performance.  Read why at
>> http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/.
>>
>>
>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
>> [2] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html
>>
>> Thanks,
>> Matt Weber
>>
>>
>>
>> On Fri, Apr 18, 2014 at 12:52 PM, Matt Hughes  wrote:
>>
>>> Trying to compose a query and filter combination to no avail:
>>>
>>> {
>>>"from":0,
>>>"size":200,
>>>"query":{
>>>   "filtered":{
>>>  "query":{
>>> "query_string":{
>>>"fields":[
>>>   "_all"
>>>],
>>>"query":"\"Test message\""
>>> }
>>>  },
>>>  "filter":{
>>> "and":[
>>>{
>>>   "term":{
>>>  "appId":"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b"
>>>   }
>>>},
>>>{
>>>   "term":{
>>>  "processId":"754311ef-d807-4bb4-8c5e-1b480fb7034f"
>>>   }
>>>}
>>> ]
>>>  }
>>>   }
>>>}
>>> }
>>>
>>> That parses fine by ES, but never returns the results.  I know the two
>>> fields are correct and in my index.  If I take off the 'filter', I get the
>>> expected results, but I need the filter to narrow the results.  When I
>>> compose the same query using Kibana, it tries to use an 'fquery' filter
>>> which I don't see documented anywhere:
>>>
>>> "filter": {
>>>
>>> "bool": {
>>>   "must": [
>>>
>>> {
>>>   "terms": {
>>>
>>> "_type": [
>>>   "event"
>>>
>>> ]
>>>   }
>>> },
>>> {
>>>
>>>   "fquery": {
>>> "query"
>>>

Re: Query and Filter

2014-04-18 Thread Matt Weber
Chances are your appId and processId fields are analyzed, so the analyzer is
breaking up the IDs into separate tokens and the term filters never match the
full value.  Update the mapping of these fields so they are not analyzed
[1].  Also, you should not use an "and" filter to combine term filters.
Use a bool filter [2] with must clauses for better performance.  Read
why at
http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/.


[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html
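
A rough sketch of both suggestions, written as Python helpers that only build
the request bodies.  The field names appId and processId come from the
original query; the doc type name "event" is an assumption taken from the
Kibana snippet, so adjust it to your own type.  Remember that existing docs
have to be reindexed after the mapping change.

```python
import json


def not_analyzed_mapping(doc_type, fields):
    """Mapping body that stores each field as a single, unanalyzed term."""
    properties = {
        field: {"type": "string", "index": "not_analyzed"} for field in fields
    }
    return {doc_type: {"properties": properties}}


def filtered_query(text, term_filters):
    """Filtered query combining a query_string with term filters in a bool/must."""
    must = [{"term": {field: value}} for field, value in term_filters.items()]
    return {
        "query": {
            "filtered": {
                "query": {"query_string": {"fields": ["_all"], "query": text}},
                "filter": {"bool": {"must": must}},
            }
        }
    }


# "event" is an assumed doc type name.
mapping = not_analyzed_mapping("event", ["appId", "processId"])
body = filtered_query(
    '"Test message"',
    {
        "appId": "a32b782c-3c51-4d76-9b01-c4c1ffe53d8b",
        "processId": "754311ef-d807-4bb4-8c5e-1b480fb7034f",
    },
)
print(json.dumps(body, indent=2))
```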

Thanks,
Matt Weber



On Fri, Apr 18, 2014 at 12:52 PM, Matt Hughes  wrote:

> Trying to compose a query and filter combination to no avail:
>
> {
>   "from": 0,
>   "size": 200,
>   "query": {
>     "filtered": {
>       "query": {
>         "query_string": {
>           "fields": [
>             "_all"
>           ],
>           "query": "\"Test message\""
>         }
>       },
>       "filter": {
>         "and": [
>           {
>             "term": {
>               "appId": "a32b782c-3c51-4d76-9b01-c4c1ffe53d8b"
>             }
>           },
>           {
>             "term": {
>               "processId": "754311ef-d807-4bb4-8c5e-1b480fb7034f"
>             }
>           }
>         ]
>       }
>     }
>   }
> }
>
> That parses fine by ES, but never returns the results.  I know the two
> fields are correct and in my index.  If I take off the 'filter', I get the
> expected results, but I need the filter to narrow the results.  When I
> compose the same query using Kibana, it tries to use an 'fquery' filter
> which I don't see documented anywhere:
>
> "filter": {
> "bool": {
>   "must": [
> {
>   "terms": {
> "_type": [
>   "event"
> ]
>   }
> },
> {
>   "fquery": {
> "query": {
>   "query_string": {
> "query": 
> "appId:(\"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b\")"
>   }
> },
> "_cache": true
>   }
> }
>   ]
> }
>
>
> Any pointers would be most appreciated.  Pulling my hair out here.
>


Re: Solr SearchComponent-like functionality?

2014-04-18 Thread Matt Weber
Well, the script runs against all documents matching the query, so you
can do a match_all query [1] to have the logic applied to all your
documents.  This is going to be expensive though, so try to filter out as
many documents as possible before applying the custom scoring.  Maybe even
perform a rescore [2] on the top X docs.  It really all depends on your
requirements though.  Run some tests and tune based on those results.

When I said to be careful, I meant don't do a lot of blocking IO or
long-running calculations, as the script is run against each matching document.
Cache results and make the script return as quickly as possible.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-match-all-query.html
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-rescore.html
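
As a sketch of the "rescore the top X docs" idea, the helper below builds a
request body where a cheap query picks the candidates and the custom scoring
only runs inside the rescore window.  This follows the 1.x request format as
I understand it from the docs above.  The script name "click_boost" is
hypothetical (it stands for whatever native script you register), and the
match on _all is just a placeholder for your real query.

```python
def rescored_custom_score_query(text, script_name, window_size=50):
    """Cheap main query, with the script-based score applied only on rescore."""
    custom_score = {
        "function_score": {
            "query": {"match_all": {}},
            "script_score": {
                "script": script_name,  # name of a registered native script
                "lang": "native",
            },
        }
    }
    return {
        "query": {"match": {"_all": text}},
        "rescore": {
            "window_size": window_size,  # top X docs per shard to rescore
            "query": {"rescore_query": custom_score},
        },
    }


# "click_boost" is a made-up script name used only for illustration.
body = rescored_custom_score_query("flat tv", "click_boost", window_size=100)
```

With this shape the script is called at most window_size times per shard
instead of once per matching document, which is where the savings come from.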

Thanks,
Matt Weber



On Fri, Apr 18, 2014 at 9:46 AM, Srinivasan Ramaswamy wrote:

> Thats great, thanks for your reply. This looks like a good solution for my
> requirement ! Is this script applied in each shard ? I want to apply this
> function to all the documents so that the Top N picked from each shard is
> picked by my custom score.
>
> Also, can you elaborate a little bit on "be careful you can significantly
> impact your query performance if you are not careful". I would like to
> understand the best practices there.
>
> On Friday, April 18, 2014 8:14:54 AM UTC-7, Matt Weber wrote:
>>
>> Yes, you can use the Function Score Query [1] in combination with a
>> native script written in java [2].  With the native script you can
>> basically do whatever you want, but be careful you can significantly impact
>> your query performance if you are not careful.
>>
>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
>> [2] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_native_java_scripts
>>
>> Thanks,
>> Matt Weber
>>
>>
>> On Thu, Apr 17, 2014 at 11:54 PM, Srinivasan Ramaswamy wrote:
>>
>>> I would like to influence the ranking with few fields that are not
>>> stored in the index (eg click data for keyword-documents). I have used
>>> custom SearchComponent in Solr to implement similar functionality in the
>>> past. I am wondering how can i achieve the same in ElasticSearch.
>>>
>>> I know this thread is a very old thread, but i didnt find much
>>> information on how to do custom scoring (in elasticsearch) with data thats
>>> not stored in the index. This thread looked very relevant to my
>>> requirement, so trying to see whether you guys have solved similar
>>> requirements with elasticsearch.
>>>
>>> Thanks
>>> Srini
>>>
>>> On Wednesday, September 7, 2011 12:18:09 PM UTC-7, Lukáš Vlček wrote:
>>>>
>>>> Hi Otis,
>>>>
>>>> So if I understand it correctly (providing my knowledge is quite
>>>> limited here) you are asking if
>>>> 1) it is possible to hook into query processing flow and inject or
>>>> extend custom handlers for individual flow phases and
>>>> 2) if we can find in ES the same functionality which is currently
>>>> provided by components listed here:
>>>> http://wiki.apache.org/solr/SearchComponent (or here:
>>>> http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html).
>>>>
>>>> As for #1, frankly, I do not know. I have been playing with plugins a
>>>> bit but did not have a chance to explore full potential of it yet. I
>>>> remember that Shay mentioned that not every aspect of ES is pluggable now
>>>> but that is all I know about it (personally, I did not hit the limits by
>>>> myself yet, may be I would if I wanted to employ Carrot2 clustering or
>>>> something like that)
>>>>
>>>> As for #2, if you are after one-to-one comparison of Solr
>>>> SearchComponents and ES then I think we would find some matches and also
>>>> some misses. Still it could be an interesting exercise to do (although we
>>>> should be careful to include only those features that do work well in
>>>> distributed environment). We could probably end up identifying new feature
>>>> requests, so this can be useful.
>>>>
>>>> Regards,
>>>> Lukas
>>>>
>>>> On Wed, Sep 7, 2011 at 6:17 PM, Otis Gospodnetic wrote:
>>>>
>>>>> Hi Lukas,
>>>>>
>>

Re: Solr SearchComponent-like functionality?

2014-04-18 Thread Matt Weber
Yes, you can use the Function Score Query [1] in combination with a native
script written in Java [2].  With the native script you can basically do
whatever you want, but be careful: a slow script can significantly impact
your query performance.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_native_java_scripts

Thanks,
Matt Weber


On Thu, Apr 17, 2014 at 11:54 PM, Srinivasan Ramaswamy
wrote:

> I would like to influence the ranking with few fields that are not stored
> in the index (eg click data for keyword-documents). I have used custom
> SearchComponent in Solr to implement similar functionality in the past. I
> am wondering how can i achieve the same in ElasticSearch.
>
> I know this thread is a very old thread, but i didnt find much information
> on how to do custom scoring (in elasticsearch) with data thats not stored
> in the index. This thread looked very relevant to my requirement, so trying
> to see whether you guys have solved similar requirements with elasticsearch.
>
> Thanks
> Srini
>
> On Wednesday, September 7, 2011 12:18:09 PM UTC-7, Lukáš Vlček wrote:
>>
>> Hi Otis,
>>
>> So if I understand it correctly (providing my knowledge is quite limited
>> here) you are asking if
>> 1) it is possible to hook into query processing flow and inject or extend
>> custom handlers for individual flow phases and
>> 2) if we can find in ES the same functionality which is currently
>> provided by components listed here:
>> http://wiki.apache.org/solr/SearchComponent (or here:
>> http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html).
>>
>> As for #1, frankly, I do not know. I have been playing with plugins a bit
>> but did not have a chance to explore full potential of it yet. I remember
>> that Shay mentioned that not every aspect of ES is pluggable now but that
>> is all I know about it (personally, I did not hit the limits by myself yet,
>> may be I would if I wanted to employ Carrot2 clustering or something like
>> that)
>>
>> As for #2, if you are after one-to-one comparison of Solr
>> SearchComponents and ES then I think we would find some matches and also
>> some misses. Still it could be an interesting exercise to do (although we
>> should be careful to include only those features that do work well in
>> distributed environment). We could probably end up identifying new feature
>> requests, so this can be useful.
>>
>> Regards,
>> Lukas
>>
>> On Wed, Sep 7, 2011 at 6:17 PM, Otis Gospodnetic 
>> wrote:
>>
>>> Hi Lukas,
>>>
>>> Yes, SearchComponents are about extensibility, but specifically about
>>> extending how queries are handled within Solr once Solr gets them.  I
>>> know ES has other types of plugins, and you've listed several of them,
>>> but I'm wondering about which of them is SearchComponent-like.
>>> I've looked at
>>> http://www.elasticsearch.org/guide/reference/modules/plugins.html,
>>> but couldn't find the answer to my Q there.  Maybe I'm looking at
>>> the wrong place?
>>>
>>> Thanks,
>>> Otis
>>> --
>>> Sematext is hiring Search Engineers -- http://sematext.com/about/jobs.html
>>>
>>> On Sep 6, 2:57 pm, Lukáš Vlček  wrote:
>>> > Hi,
>>> >
>>> > I am not Solr expert but to me it seems that SearchComponents in Solr are
>>> > about extensibility of out of the box functionality. If that is the case
>>> > then I would say that we can talk about plugins in ES world. Although there
>>> > is no official doc about how to implement custom plugins yet it is really
>>> > not difficult. Apart from that there are several plugins that are part of
>>> > distribution (river plugins, attachments mapper, ICU analysis, scripting
>>> > languages ... to name a few) and they can be used as an inspiration if a new
>>> > plugin implementation is needed.
>>> >
>>> > My 2 cents.
>>> >
>>> > Lukas
>>> >
>>> > On Tue, Sep 6, 2011 at 5:35 PM, Otis Gospodnetic
>>> > <otis.gospodne...@gmail.com> wrote:
>>> > > Hello,
>

Re: Java Serialization of Exceptions

2014-03-21 Thread Matt Weber
If this is a concern, why not have your clients use the REST API so they
don't need to worry about matching their Java version with the Java version
of the search cluster?

Thanks,
Matt Weber



On Fri, Mar 21, 2014 at 12:07 PM, kimchy  wrote:

> Not trivializing the bug at all, god knows I spend close to a week tracing
> it down to a JVM backward incompatibility change, but this happened once
> over the almost 5 years Elasticsearch existed. To introduce a workaround to
> something that happened once, compared to potential bugs in the workaround
> (Jackson is great, but what would happen if there was a bug in it for
> example) is not a great solution. Obviously, if this happened more often,
> then this is something we need to address.
>
> On Friday, March 21, 2014 7:12:02 PM UTC+1, Chris Berry wrote:
>
>> If it happened once, then by definition it will happen again. History
>> repeats itself. ;-)
>>
>> What exactly would you lose?
>> You are simply trading one rigid serialization scheme for another more
>> lenient one.
>> Yes, you would have to introduce something like Jackson’s Object Mapper,
>> but that seems to be the defacto standard today and with your use of the
>> Shade Plugin it wouldn’t really be a burden on the Client anyway.
>>
>> With all due respect, you may be trivializing the impact of this one time
>> bug.
>> It is difficult, at best, to inform all the Clients of your Cluster;
>> “Hey, if you want to see what your Exceptions really are, then upgrade your
>> JVM”
>> Especially in large SOA shops
>>
>> This just decouples the Client and Server deployments.
>>
>> Thanks much,
>> — Chris
>>
>> On Mar 21, 2014, at 12:18 PM, kimchy  wrote:
>>
>> I wonder why you are asking for this feature? If its because Java broke
>> backward comp. on serialization of InetAddress that we use in our
>> exceptions, then its a bug in Java serialization, hard for us to do
>> something about it.
>>
>> You will loose a lot by trying to serialize exceptions using JSON, and we
>> prefer not to introduce dependency on ObjectMapper in Jackson, or try and
>> serialize exceptions using Jackson.
>>
>> I would be very careful in introducing this just because of a (one time
>> bug) in Java.
>>
>> On Friday, March 21, 2014 5:18:38 PM UTC+1, Chris Berry wrote:
>>>
>>> Greetings,
>>>
>>> Let me say up-front, I am a huge fan and proponent of Elasticsearch. It
>>> is a beautiful tool.
>>>
>>> So, that said, it surprises me that Elasticsearch has such a pedestrian
>>> flaw, and serializes it's Exceptions using Java Serialization.
>>> In a big shop it is quite difficult (i.e. next to impossible) to keep
>>> all the ES Clients on the same exact JVM as Elasticsearch, and thus, it is
>>> not uncommon to get TransportSerializationExceptions instead of the
>>> actual underlying problem.
>>> I was really hoping this would be corrected in ES 1.0.X, but no such
>>> luck. (As far as I can tell...)
>>>
>>> It seems that this is pretty easily fixed?
>>> Just switch to a JSON representation of the basic Exception and
>>> gracefully (forwards-compatibly) attempt to re-hydrate the actual Exception
>>> class.
>>> You'd just have to drop an additional "header" in the stream that tells
>>> the code it is a JSON response and route to the right Handler it
>>> accordingly. If the header is missing, then do things the old way with Java
>>> Serialization??
>>>
>>> Are there any plans to fix this? Or perhaps to entertain a Pull Request?
>>> It may seem just an annoyance, but it is actually pretty bad, in that it
>>> keeps Clients from seeing their real issues. Especially in shops where it
>>> is difficult to see the Production logs of Elasticsearch itself.
>>>
>>> Thanks much,
>>> -- Chris
>>>
>>>
>>>

Re: questions about aggregation min_doc_count = 0

2014-03-18 Thread Matt Weber
1.  The histogram aggregation (and facet) works on indexed values, not on
the current time or "now".  So, if the last indexed document timestamp
is 3/15/14T16:15, you will not get empty buckets between 3/15/14T16:15 and the
current time.  It would be interesting to be able to set a "to" and
"from" on histogram-based aggregations to allow for generating buckets at
intervals across the defined range.

2.  I believe this comes from the way the keys are pulled from the fielddata,
which is index-level data.  So if you are using the "_all" index, you are going
to get data from all indices.  Not sure if this is a bug or not.  You can try
applying a filter aggregation:

POST _all/summary_phys/_search
{
  "aggs": {
    "summary_phys_events": {
      "filter": {
        "type": {"value": "summary_phys_events"}
      },
      "aggs": {
        "events_by_date": {
          "date_histogram": {
            "field": "@timestamp",
            "interval": "300s",
            "min_doc_count": 0
          },
          "aggs": {
            "events_by_host": {
              "terms": {
                "field": "host.raw",
                "min_doc_count": 0
              },
              "aggs": {
                "avg_used": {
                  "avg": {
                    "field": "used"
                  }
                },
                "max_used": {
                  "max": {
                    "field": "used"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
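
For point 1, until the histogram can be given an explicit range, one
workaround is to pad the missing leading and trailing buckets on the client
side.  The sketch below is only an illustration: it assumes each bucket has a
"key" in epoch milliseconds that is aligned to a multiple of the interval
(which should hold for a fixed interval like 300s), and the function name and
arguments are made up for this example.

```python
def pad_buckets(buckets, start_ms, end_ms, interval_ms):
    """Return buckets covering [start_ms, end_ms], adding zero-count entries
    for every interval the date_histogram response did not include."""
    by_key = {bucket["key"]: bucket for bucket in buckets}
    # Align the start of the range down to an interval boundary.
    first_key = (start_ms // interval_ms) * interval_ms
    padded = []
    for key in range(first_key, end_ms + 1, interval_ms):
        padded.append(by_key.get(key, {"key": key, "doc_count": 0}))
    return padded


interval = 300 * 1000  # 300s, as in the query above
response_buckets = [{"key": 600000, "doc_count": 2}]
padded = pad_buckets(response_buckets, 0, 900000, interval)
print([(b["key"], b["doc_count"]) for b in padded])
```

Buckets that did come back are passed through untouched, so any
sub-aggregation values on them are preserved.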





On Tue, Mar 18, 2014 at 12:39 PM, John Stanford wrote:

> Hi,
>
> I'm trying to get a better understanding of aggregations, so here are a
> couple of questions that came up recently.
>
> Question 1:
>
> I have some time based data that I am using aggregations to chart.  The
> data may be sparsely populated, so I've been setting min_doc_count to 0 so
> I get empty buckets back anyway.  I've noticed that it will fill in empty
> buckets unless they are before or after the first record of the range.
>
> For example, if I use a query similar to the one below, and there are no
> records after 3/15/14T16:15, the last aggregation record will be for
> 3/15/14T16:15.  On the other hand, if there is a gap in between the start
> time and 3/15/14T16:15, I will get a bucket with a 0 doc count (as
> expected).
>
> POST _all/summary_phys/_search
>
> {
>"aggs": {
>   "events_by_date": {
>  "date_histogram": {
> "field": "@timestamp",
> "interval": "300s",
> "min_doc_count": 0
>  },
>  "aggs": {
> "events_by_host": {
>"terms": {
>   "field": "host.raw"
>},
>"aggs": {
>   "avg_used": {
>  "avg": {
> "field": "used"
>  }
>   },
>   "max_used": {
>  "max": {
> "field": "used"
>  }
>   }
>}
> }
>  }
>   }
>}
> }
>
> Not getting the 0 doc count buckets back at the front and back of the
> range seems contrary to the documented purpose of min_doc_count.  Am I
> doing something wrong?
>
> Question 2:
>
>
> If I add a min_doc_count = 0 to the inner aggregation, but limit the
> search to a specific doc type like:
>
>   doc type
>v
> POST _all/summary_phys/_search
> {
>"aggs": {
>   "events_by_date": {
>  "date_histogram": {
> "field": "@timestamp",
> "interval": "300s",
> "min_doc_count": 0
>  },
>  "aggs": {
> "events_by_host": {
>"terms": {
>   "field": "host.raw",
>   "min_doc_count": 0
>},
>"aggs": {
>   "avg_used": {
>  "avg": {
> "field": "used"
>  }
>   },
>   "max_used": {
>  "max": {
> "field": "used"
>  }
>   }
>}
> }
>  }
>   }
>}
> }
>
> I get buckets with entries matching hosts that do not show up in this doc
> type.  For example, I have only 3 values for host in this doc type
> [compute-4, compute-2, compute-3], but I will get buckets back with hosts
> from other doc types like:
>
> "events_by_host": {
>   "buckets": [
>  {
> "key": "compute-4",
> "doc_count": 11,
> "max_used": {
>"value": 4608
> },
> "avg_used": {
>"value": 3677.090909090909
> }
>  },
>  

Re: How to join 2 indexes at query time

2014-02-26 Thread Matt Weber
How about using parent/child functionality?

https://gist.github.com/mattweber/96f3515fc4453a5cb0db

Thanks,
Matt Weber



On Wed, Feb 26, 2014 at 7:45 PM, Jayesh Bhoyar wrote:

> Hi Binh,
>
> Thanks for the answer.
>
> Is there any case if I index this data into same index with different
> category GIST@ https://gist.github.com/jsbonline2006/9243973
> I have 1 index:
>
> productindex/ Type: offertype
> productindex/ Type: categorytype
>
>
> Now as per my index data:
> My input will be category "Flat TV"
> And in output: I want all skuid for "Flat TV" and there corresponding 
> offer_id.
>
> Regards,
> Jayesh Bhoyar
>
> *GIST @https://gist.github.com/jsbonline2006/9243973 
> <https://gist.github.com/jsbonline2006/9243973>*
>
>
> On Wednesday, February 26, 2014 8:07:01 PM UTC+5:30, Binh Ly wrote:
>>
>> Unfortunately, ES is not like SQL in this respect. You'll need to
>> denormalize somewhat because ES is more "document-oriented". You'd probably
>> need to either denormalize offer_id into categorytype, or category into
>> offertype to get all the data you want returned in 1 query.
>>


Re: 1.0.0RC1 broken changes

2014-01-15 Thread Matt Weber
Use the master docs

http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/index.html

Looks like your call should be:

/_nodes/4oL7COyQTNiQPa4xZ76Pfg/stats?all=true&plugin=true
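
If you have a lot of these calls to update, a tiny helper can do the rewrite.
This sketch only covers the one rename shown in this thread
(/_cluster/nodes/... to /_nodes/...); the function name is made up, and any
other changed endpoints would need to be checked against the master docs.

```python
def to_1x_node_stats_path(path):
    """Rewrite an old /_cluster/nodes/... path to the /_nodes/... form."""
    old_prefix = "/_cluster/nodes/"
    if path.startswith(old_prefix):
        return "/_nodes/" + path[len(old_prefix):]
    # Leave anything we do not recognize untouched.
    return path


old_path = "/_cluster/nodes/4oL7COyQTNiQPa4xZ76Pfg/stats?all=true&plugin=true"
new_path = to_1x_node_stats_path(old_path)
```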

Thanks,
Matt Weber



On Wed, Jan 15, 2014 at 12:31 PM, Roy Russo  wrote:

> Hello,
>
> So it appears a few APIs have changed in 1.0.0RC1 (which is bizarre
> considering I tested on master just last week and everything worked, but
> whatever...) Any ideas when the documentation will be updated to reflect
> the changes? I'm having a hard time mapping old API calls to the new
> urls...
>
> For instance,
>
> THIS: /_cluster/nodes/4oL7COyQTNiQPa4xZ76Pfg/stats?all=true&plugin=true
>
> IS NOW WHAT? /_???
>
> The breaking changes page is not fine-grained enough in mapping new to old
> calls... at least for me. ;-)
>
>


Re: Filter and Query same taking some time

2014-01-09 Thread Matt Weber
Use a filtered query, not an outer filter.   You only want to use that
outer filter when you are faceting and don't want the filter to change the
facet counts.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html
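
To show the difference in request shape, here is a small sketch that takes a
body using the outer "filter" (like your request 1) and moves that filter
inside a filtered query.  The helper name is made up; it just rearranges the
dict and does not talk to ES.

```python
def to_filtered_query(body):
    """Move a top-level "filter" into a filtered query wrapping the query."""
    converted = dict(body)  # shallow copy so the caller's dict is not modified
    outer_filter = converted.pop("filter", None)
    if outer_filter is None:
        return converted
    converted["query"] = {
        "filtered": {
            "query": converted.get("query", {"match_all": {}}),
            "filter": outer_filter,
        }
    }
    return converted


outer_filter_body = {
    "size": 100,
    "query": {"match_all": {}},
    "filter": {"bool": {"must": {"term": {"color": "red"}}}},
    "version": True,
}
filtered_body = to_filtered_query(outer_filter_body)
```

With the filtered form the filter is applied while the query runs, rather
than afterwards on the hits, which is the behaviour you want when you are not
faceting.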

Thanks,
Matt Weber


On Thu, Jan 9, 2014 at 1:13 AM, Arjit Gupta  wrote:

> I had 13 Million documents and with the same query
> I see Filters performing worse then query
> filters are taking 400ms where as query is taking 300 ms
>
> 1. Filter
>
> {
>   "size" : 100,
>   "query" : {
> "match_all" : { }
>   },
>   "filter" : {
> "bool" : {
>   "must" : {
> "term" : {
>   "color" : "red"
> }
>   }
> }
>   },
>   "version" : true
> }
>
>
> 2. Query
>
> {
>   "size" : 100,
>   "query" : {
> "bool" : {
>   "must" : {
> "match" : {
>   "color" : {
> "query" : "red",
> "type" : "boolean",
> "operator" : "AND"
>   }
> }
>   }
> }
>   },
>   "version" : true
> }
>
> Thanks ,
> Arjit
>
>
> On Thu, Jan 9, 2014 at 1:15 PM, David Pilato  wrote:
>
>> Yeah 10 documents is not that much!
>> Not sure if you can notice a difference here as probably everything could
>> be loaded in file system cache.
>>
>> --
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>> @dadoonet <https://twitter.com/dadoonet> | 
>> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>>
>>
>> Le 9 janvier 2014 at 08:43:13, Arjit Gupta 
>> (arjit...@gmail.com)
>> a écrit:
>>
>> I have 100,000 documents  which are similar. In response I am getting the
>> whole document not just Id.
>> I am executing the query multiple times.
>>
>> Thanks ,
>> Arjit
>>
>>
>> On Thu, Jan 9, 2014 at 1:06 PM, David Pilato  wrote:
>>
>>>  You probably won't see any difference the first time you execute it
>>> unless you are using warmers.
>>>  With a second query, you should see the difference.
>>>
>>>  How many documents you have in your dataset?
>>>
>>>  --
>>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>>> @dadoonet <https://twitter.com/dadoonet> | 
>>> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>>>
>>>
>>> Le 9 janvier 2014 at 06:14:06, Arjit Gupta 
>>> (arjit...@gmail.com)
>>> a écrit:
>>>
>>>   Hi,
>>>
>>> I had implemented ES search query  for all our use cases but when i came
>>> to know that some of our use cases can be solved by filters I implemented
>>> that but I dont see any gain (in response time) in filters. My search
>>> queries  are
>>>
>>> 1. Filter
>>>
>>> {
>>>   "size" : 100,
>>>   "query" : {
>>> "match_all" : { }
>>>   },
>>>   "filter" : {
>>> "bool" : {
>>>   "must" : {
>>> "term" : {
>>>   "color" : "red"
>>> }
>>>   }
>>> }
>>>   },
>>>   "version" : true
>>> }
>>>
>>>
>>> 2. Query
>>>
>>> {
>>>   "size" : 100,
>>>   "query" : {
>>> "bool" : {
>>>   "must" : {
>>> "match" : {
>>>   "color" : {
>>> "query" : "red",
>>> "type" : "boolean",
>>> "operator" : "AND"
>>>   }
>>> }
>>>   }
>>> }
>>>   },
>>>   "version" : true
>>> }
>>>
>>> By default the term query should be cached but I dont see a performance
>>> gain.
>>> Do i need to change some parameter also  ?
>>> I am using ES  0.90.1 and with 16Gb of heap space given to ES.
>>>
>>> Thanks,
>>> Arjit

Re: How to configure and implement Synonyms with multi words.

2014-01-09 Thread Matt Weber
Here is a little example of query time multi-word synonyms:

https://gist.github.com/mattweber/7374591

Hope this helps.

Thanks,
Matt Weber



On Thu, Jan 9, 2014 at 12:56 AM, Jayesh Bhoyar wrote:

> Also I have another scenario where my index is having words like
>
> software engineer, se, ---> this should get seached when I do search on
> Software engineer
> team lead, lead, tl ---> this should get seached when I do search on Team
> Lead
>
>
>
> Following are the query to create the records.
>
> curl -XPUT 'http://localhost:9200/employee/test/11?pretty' -d '{"designation": "software engineer"}'
> curl -XPUT 'http://localhost:9200/employee/test/12?pretty' -d '{"designation": "se"}'
> curl -XPUT 'http://localhost:9200/employee/test/13?pretty' -d '{"designation": "sse"}'
> curl -XPUT 'http://localhost:9200/employee/test/14?pretty' -d '{"designation": "senior software engineer"}'
> curl -XPUT 'http://localhost:9200/employee/test/15?pretty' -d '{"designation": "team lead"}'
> curl -XPUT 'http://localhost:9200/employee/test/16?pretty&refresh=true' -d '{"designation": "tl"}'
> curl -XPUT 'http://localhost:9200/employee/test/17?pretty&refresh=true' -d '{"designation": "lead"}'
>
>
>
> On Thursday, January 9, 2014 2:12:05 PM UTC+5:30, Jayesh Bhoyar wrote:
>>
>> Hi,
>>
>> I have the following synonyms that I want to configure:
>>
>> software engineer => software engineer, se
>> senior software engineer => senior software engineer, sse
>> team lead => team lead, lead, tl
>>
>> If I search for "se" or "Software Engineer", it should return the
>> records containing software engineer.
>>
>> What mapping should I apply to the designation field, and what query
>> should I run to get this result?
>> Is it possible to use a multi_match query?
>>
>> The following commands create the records.
>>
>> curl -XPUT 'http://localhost:9200/employee/test/1?pretty' -d '{"designation": "software engineer"}'
>> curl -XPUT 'http://localhost:9200/employee/test/2?pretty' -d '{"designation": "software engineer"}'
>> curl -XPUT 'http://localhost:9200/employee/test/3?pretty' -d '{"designation": "senior software engineer"}'
>> curl -XPUT 'http://localhost:9200/employee/test/4?pretty' -d '{"designation": "senior software engineer"}'
>> curl -XPUT 'http://localhost:9200/employee/test/5?pretty' -d '{"designation": "team lead"}'
>> curl -XPUT 'http://localhost:9200/employee/test/6?pretty&refresh=true' -d '{"designation": "team lead"}'
>>



Re: Plain filter and constant_score

2013-12-30 Thread Matt Weber
The "outer" filter is basically a post filter, i.e., filtering happens after
all the documents have been collected via the query. It should not
really be used unless you are doing something like multi-select
faceting, where you don't want the facet counts to be affected by the filter.
You should use a filtered query [1] or, as you discovered, a constant
score query if you only want to execute a filter.

BTW, in Elasticsearch 1.0 this outer filter has been renamed to
"post_filter" to avoid some of the confusion.
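
To make the difference concrete, here is a minimal sketch of the three request shapes discussed above, built as Python dicts. The filter itself (`status` = `active`) and the helper names are hypothetical and only there for illustration; the `filter`/`post_filter`, `filtered`, and `constant_score` keys are the ones from the query DSL of the versions discussed in this thread.

```python
import json

# Hypothetical filter, used only for illustration.
EXAMPLE_FILTER = {"term": {"status": "active"}}


def top_level_filter(flt, pre_1_0=True):
    """Outer (post) filter: applied to the hits after the query has run.

    The key is "filter" before Elasticsearch 1.0 and "post_filter" from 1.0 on.
    """
    key = "filter" if pre_1_0 else "post_filter"
    return {"query": {"match_all": {}}, key: flt}


def filtered_query(flt, query=None):
    """Filtered query: the filter restricts the documents the query sees."""
    return {
        "query": {
            "filtered": {
                "query": query or {"match_all": {}},
                "filter": flt,
            }
        }
    }


def constant_score(flt):
    """Constant score query: runs only the filter, every hit gets the same score."""
    return {"query": {"constant_score": {"filter": flt}}}


if __name__ == "__main__":
    bodies = (
        top_level_filter(EXAMPLE_FILTER),
        filtered_query(EXAMPLE_FILTER),
        constant_score(EXAMPLE_FILTER),
    )
    for body in bodies:
        print(json.dumps(body, indent=2))
```

The first shape is the one to reserve for cases like multi-select faceting; the other two are the ones to benchmark when all you need is the filter.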

Thanks,
Matt Weber



On Mon, Dec 30, 2013 at 2:53 PM, Han JU  wrote:

> Hi,
>
> We are currently benchmarking our ES setup so I've got some new questions:
>
> 1. We found that the same query (a filter, actually) written like this:
>
> {
>   "filter": {...},
>   "fields": [...]
> }
>
> is consistently slower than this form:
>
> {
>   "query": {
>     "constant_score": {
>       "filter": {...},
>       ...
>     }
>   }
> }
>
> The filter and fields parts are identical, but the performance differs.
> Especially when caches are warm, filters wrapped in a constant_score
> are nearly 10x faster than filters put at the top level.
> What happens behind the scenes? How does Elasticsearch interpret filters
> that are put directly at the top level (not wrapped in any outer structure)?
>
> Thanks in advance.
>
