from:"Ivan Brusic"

Re: Dealing with spam in this forum

2015-09-28 Thread Ivan Brusic

Does the new mailing list have moderators to deal with spam?

Cheers,

Ivan
Hi all

Recently we've had a few spam emails that have made it through Google's
filters, and there have been calls for us to change to a
moderate-first-post policy. I am reluctant to adopt this policy for the
following reasons:

We get about 30 new users every day from all over the world, many of whom
are early in their learning phase and are quite stuck - they need help as
soon as possible. Fortunately this list is very active and helpful. In
contrast, we've only ever banned 34 users from the list for spamming.  So
making new users wait for timezones to swing their way feels like a heavy
handed solution to a small problem. Yes, spammers are annoying but they are
a small minority on this list.

Instead, we have asked 10 of our long standing members to help us with
banning spammers.  This way we have Spam Guardians active around the globe,
who only need to do something if a spammer raises their ugly head above the
parapet. One or two spam emails may get through, but hopefully somebody
will leap into action and stop their activity before it becomes too
tiresome.

This isn't an exclusive list. If you would like to be on it, feel free to
email me.  Note: I expect you to be a long standing and currently active
member of this list to be included.

If this solution doesn't solve the problem, then we can reconsider
moderate-first-post, but we've managed to go 5 years without requiring it,
and I'd prefer to keep things as easy as possible for new users.

Clint

-- 
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/c9af5a09-0295-42e3-bc20-52471828aa96%40googlegroups.com

.
For more options, visit https://groups.google.com/d/optout.

-- 
Please update your bookmarks! We have moved to https://discuss.elastic.co/

Re: Forums Are Now Live at http://discuss.elastic.co

2015-07-07 Thread Ivan Brusic

I cannot reply to any posts. I keep getting: Email issue -- Unknown Reply
Key

Any status on the existing issues?

Cheers,

Ivan
On May 12, 2015 5:47 PM, Leslie Hawthorn leslie.hawth...@elastic.co
wrote:

 It is entirely possible some of these items can be fixed. Investigating.

 Thank you for the candid feedback, Doug, Ivan and Jörg.

 Cheers,
 LH

 On Mon, May 11, 2015 at 10:24 PM, joergpra...@gmail.com 
 joergpra...@gmail.com wrote:

 +1 for Doug and Ivan

 I'd also like to find the real names (which are available at Discuss
 because they are shown in the profile of a user) being added to the message
 view and the mail From header, for a more personal communication style.
 It would be easier to begin a reply with a greeting then.

 Hopefully this is received as constructive feedback and not as moaning
 about Discuss software.

 Best,

 Jörg


 On Mon, May 11, 2015 at 12:03 AM, Doug Turnbull 
 dturnb...@opensourceconnections.com wrote:

 Oh! I didn't see that. Thank you. I will look again.

 Yeah I actually like many things about discourse. I also like many of
 the low friction aspects of a mailing list. I am worried my email
 sounds too negative. I should list positives like great markdown support
 and much prettier and more legible emails. Discourse is phenomenal forum
 software. I honestly hope they can really bridge the gap effortlessly to
 have as part of their capabilities truly effortless implementation of a dev
 mailing list with an optional forum view.

 I'm sure I'll be using discourse to great effect (and it seems to be
 being used). So I could be completely wrong about all my points. :)

 Doug

 On Sunday, May 10, 2015, Ivan Brusic i...@brusic.com wrote:

 I should have added something similar to what Doug expressed in his
 last paragraph. My feedback was meant to be constructive. Despite this
 being a technical mailing list, I still appreciate a more personal touch.

 BTW, Doug, you can watch specific categories. There is a general
 watch-all setting that you might have turned on.

 Ivan
 On May 10, 2015 2:37 PM, Doug Turnbull 
 dturnb...@opensourceconnections.com wrote:

 I agree.

 While I appreciate the experimentation, I hope discourse can get to a
 point where I can forget it's even a forum. Whereby everything can just
 truly feel like a native mailing list with a great deal less friction, but
 it's not there yet.

 Here's some of the missing features:

 - a signup process that's more an email subscription (just paste in an
 email address) without having to create any kind of forum account or
 profile.

 - the ability to reply to someone individually to their email address.
 The ability for others to email me directly without Discourse's PM feature.

 - It needs the ability to subscribe per topic. Things are a bit noisy
 right now (i.e. I'm not interested in Logstash)

 - eliminate noreply in the email notifications. I don't feel like I
 should participate via email when I see this. Give it a friendly name.

 Right now use via email feels second class. But I think it's the most
 important thing. I'm likely to scan my low priority inbox where mailing
 list emails are sent. I'm going to struggle to remember to check in on and
 participate in a forum to help folks. It's another place to go and all my
 other OSS mailing lists come to my email and I can work with them
 seamlessly. So I'm likely to forget to check or possibly not want to bother
 with elastic which uses a different system.

 Yes, I do get the notifications, but it doesn't quite feel the same as
 a mailing list for the reasons above. It feels like a notification from
 another system.

 Anyway long and frank email. Forgive the bluntness. I just wanted to
 express hopefully useful feedback. I do appreciate the thoughtfulness
 here. I know elastic and discourse folks are very smart. Email truly can
 become a first class experience and keep some of the great things about
 discourse.

 Cheers!
 Doug

 On Sunday, May 10, 2015, Ivan Brusic i...@brusic.com wrote:

 I really do not care for the new mailing list.

 First of all, I can no longer see real names and email addresses. All
 I see is whatever nonsensical handle someone chose on sign up. Searching
 for Adrien no longer returns his latest posts.

 Second, since every email comes from nore...@discuss.elastic.co, I
 can no longer see who replied to a thread. All I see is the handle of the
 original poster. I can see the reply counts, but have no idea who the
 replies are from.

 Add both of these issues together, and the list has now become very
 impersonal.

 Cheers,

 Ivan
  On May 5, 2015 10:49 AM, Leslie Hawthorn 
 leslie.hawth...@elastic.co wrote:

 Sadly, we cannot twibble bit to allow certain types of links but not
 others.

 However, we can adjust the forum settings to allow users to include
 links in their posts from the start of using the forum. I've done so. Let
 us know if you have any further issues.

 And, to reiterate what Tyler said earlier, it would be super

Re: Forums Are Now Live at http://discuss.elastic.co

2015-05-10 Thread Ivan Brusic

I should have added something similar to what Doug expressed in his last
paragraph. My feedback was meant to be constructive. Despite this being a
technical mailing list, I still appreciate a more personal touch.

BTW, Doug, you can watch specific categories. There is a general watch-all
setting that you might have turned on.

Ivan
On May 10, 2015 2:37 PM, Doug Turnbull 
dturnb...@opensourceconnections.com wrote:

 I agree.

 While I appreciate the experimentation, I hope discourse can get to a
 point where I can forget it's even a forum. Whereby everything can just
 truly feel like a native mailing list with a great deal less friction, but
 it's not there yet.

 Here's some of the missing features:

 - a signup process that's more an email subscription (just paste in an
 email address) without having to create any kind of forum account or
 profile.

 - the ability to reply to someone individually to their email address. The
 ability for others to email me directly without Discourse's PM feature.

 - It needs the ability to subscribe per topic. Things are a bit noisy
 right now (i.e. I'm not interested in Logstash)

 - eliminate noreply in the email notifications. I don't feel like I
 should participate via email when I see this. Give it a friendly name.

 Right now use via email feels second class. But I think it's the most
 important thing. I'm likely to scan my low priority inbox where mailing
 list emails are sent. I'm going to struggle to remember to check in on and
 participate in a forum to help folks. It's another place to go and all my
 other OSS mailing lists come to my email and I can work with them
 seamlessly. So I'm likely to forget to check or possibly not want to bother
 with elastic which uses a different system.

 Yes, I do get the notifications, but it doesn't quite feel the same as a
 mailing list for the reasons above. It feels like a notification from
 another system.

 Anyway long and frank email. Forgive the bluntness. I just wanted to
 express hopefully useful feedback. I do appreciate the thoughtfulness
 here. I know elastic and discourse folks are very smart. Email truly can
 become a first class experience and keep some of the great things about
 discourse.

 Cheers!
 Doug

 On Sunday, May 10, 2015, Ivan Brusic i...@brusic.com wrote:

 I really do not care for the new mailing list.

 First of all, I can no longer see real names and email addresses. All I
 see is whatever nonsensical handle someone chose on sign up. Searching for
 Adrien no longer returns his latest posts.

 Second, since every email comes from nore...@discuss.elastic.co, I can
 no longer see who replied to a thread. All I see is the handle of the
 original poster. I can see the reply counts, but have no idea who the
 replies are from.

 Add both of these issues together, and the list has now become very
 impersonal.

 Cheers,

 Ivan
  On May 5, 2015 10:49 AM, Leslie Hawthorn leslie.hawth...@elastic.co
 wrote:

 Sadly, we cannot twibble bit to allow certain types of links but not
 others.

 However, we can adjust the forum settings to allow users to include
 links in their posts from the start of using the forum. I've done so. Let
 us know if you have any further issues.

 And, to reiterate what Tyler said earlier, it would be super awesome to
 put this feedback in the Meta Elastic category.[0]

 Why you may ask?

 1) Employees from Discourse are keeping an eye on our forum to help us
 help you have a good experience. They're not reading this list, though.

 2) Trying for the single source of truth mentioned in my original post.

 Thank you to everyone for the great feedback so far!

 [0] - https://discuss.elastic.co/c/meta or via email discuss+meta [at]
 elastic [dot] co

 Cheers,
 LH

 On Tue, May 5, 2015 at 12:57 AM, Nikolas Everett nik9...@gmail.com
 wrote:

 I think github and pastebins/gists shouldn't be considered against the
 limit. We ask people to use gist all the time and github issue or code
 links are a good thing to use as well.
 On May 4, 2015 5:40 PM, joergpra...@gmail.com joergpra...@gmail.com
 wrote:

 Thanks Shaunak,

 I appreciate that. I think it would be more than welcome to let others
 of the community also take the advantage of including Github issues into
 the forum software which contain numerous links:

 https://discuss.elastic.co/t/link-level-in-a-post/151

 Best,

 Jörg


 On Mon, May 4, 2015 at 11:05 PM, shau...@elastic.co wrote:

 Hey Jörg,

 I've removed this restriction from your account. You should be able
 to post more than 2 links in a post now :)

 Shaunak

 On Monday, May 4, 2015 at 2:07:14 PM UTC-5, Jörg Prante wrote:

 It does not work. I can not post messages with links.

 After I try to post a new topic such as

 - snip
 To all of you who want to sneak at the features planned for ES 2.0,
 this issue collects some of it

 https://github.com/elastic/elasticsearch/issues/9970

 Best,

 Jörg
 snip

 I receive a denial

 Sorry, new users can only put 2 links

Re: Forums Are Now Live at http://discuss.elastic.co

2015-05-05 Thread Ivan Brusic

I am watching a few select categories with email notifications, but I still
received notifications for other categories, Logstash in my case.

Ivan
On May 4, 2015 6:12 PM, leslie.hawthorn leslie.hawth...@elastic.co
wrote:

Hello everyone,

We took in feedback on moving to a Discourse based forum for about a
month, and it sounds like most of the folks who thought it might not be
optimal were people who preferred to interact with mailing lists instead of
forums.

We're pretty confident the email functionality of Discourse will work well
for our community, so we've gone ahead and rolled out the forums. You can
visit them now and sign up for a user account at http://discuss.elastic.co.
Registration is one time only and you can do so with any email address or
authorizing via Facebook, GitHub, Google Accounts or Twitter.

Once you've created your account, you can set up your preferences to
receive email as often or as rarely as you would like.

For those who'd prefer to interact solely via email, you can start doing
so as soon as you've set these preferences. You can find full documentation
on interacting with the forums solely via email here.[0]

Some anticipated FAQs:

1) Should I ask for help using the forums on this mailing list?
You're welcome to ask for help on this list, on IRC (#elasticsearch,
#logstash or #kibana on Freenode) or within the forum in the Meta Elastic
category.

2) Are you going to stop answering questions here?
No, we're going to leave the mailing list active for at least 30 days so
our entire community can kick the tires. Quoting our internal company FAQ
for this very question:
If our community members find Discourse to be suboptimal, we may make
the choice to use the forums only for certain tasks and preserve the
existing mailing lists.

However, our goal is to provide a single source of truth for information
on our products, so we're hoping the Discourse based forums really work for
people. Our employees will be answering your questions with a pointer to a
forum thread to encourage people to actually interact with the new
resource, too.

3) So let's say that the forums work for people. What's the plan?
We'll set these mailing lists to read-only ~30 days from now (June 1,
2015). We're still working with the creators of Discourse to do a full
import of our mailing list archives, so once that task is complete the
read-only archives will still be preserved, but you can search through all
that collective knowledge in one place at http://discuss.elastic.co

4) Is there an FAQ for using the forums?
We feel like the user interface for Discourse is pretty intuitive, but we
have a short Notes on Using This Forum document [1] available to help you
get started.

5) Where do I direct praise and pain points for the forums?
Feedback on the forums should be posted to the Meta Elastic category. [2]
You can update it via email using address discuss-meta [at] elastic [dot]
co once you've set up your user account.

6) I have another question that you have not answered. What should I do?
Please post a note in the Meta Elastic category, ping in IRC (I'm lh on
Freenode, though anyone with ChanOps/Voice in #elasticsearch, #logstash and
#kibana can help) or post a note in this thread.

[0] -
https://discuss.elastic.co/t/email-only-interaction-with-the-forums/106
[1] - https://discuss.elastic.co/t/notes-on-using-these-forums/118
[2] - https://discuss.elastic.co/c/meta

Cheers,
LH

Leslie Hawthorn
Director of Developer Relations
http://elastic.co

Other Places to Find Me:
Freenode: lh
Twitter: @lhawthorn


Re: Forums Are Now Live at http://discuss.elastic.co

2015-05-05 Thread Ivan Brusic

I might have found the conflicting setting.
On May 5, 2015 9:43 AM, Ivan Brusic i...@brusic.com wrote:

I am watching a few select categories with email notifications, but I
still received notifications for other categories, Logstash in my case.

Ivan
On May 4, 2015 6:12 PM, leslie.hawthorn leslie.hawth...@elastic.co
wrote:

Hello everyone,

We're pretty confident the email functionality of Discourse will work
well for our community, so we've gone ahead and rolled out the forums. You
can visit them now and sign up for a user account at
http://discuss.elastic.co. Registration is one time only and you can do
so with any email address or authorizing via Facebook, GitHub, Google
Accounts or Twitter.

Once you've created your account, you can set up your preferences to
receive email as often or as rarely as you would like.

Some anticipated FAQs:

[0] -
https://discuss.elastic.co/t/email-only-interaction-with-the-forums/106
[1] - https://discuss.elastic.co/t/notes-on-using-these-forums/118
[2] - https://discuss.elastic.co/c/meta

Cheers,
LH

Leslie Hawthorn
Director of Developer Relations
http://elastic.co

Other Places to Find Me:
Freenode: lh
Twitter: @lhawthorn


Re: Split brain problem in 2 node elasticsearch cluster

2015-05-05 Thread Ivan Brusic

In non big data scenarios, having two servers for a database is simply
done to achieve high availability. Most databases use a master client
scenario, but Elasticsearch does not support such a setup. It really should
because not everyone has tons of data.

Ivan, not affiliated with the OP
On May 4, 2015 8:32 AM, Jason Wee peich...@gmail.com wrote:

Why must you have only two nodes? Would it be possible to add one more
node so split brain will not become an issue?

jason

On Mon, May 4, 2015 at 2:20 PM, Mark Walkom markwal...@gmail.com wrote:

Your nodes aren't in different DCs are they? If so this is why we don't
support such setups, because ES is latency sensitive and these sorts of
things can happen very easily when your network is unreliable.

They don't try to ping other nodes because you only have two, and if they
lose contact with one another then they both assume they are masters and
create their own cluster. Masters don't ping other nodes at random and see
if they should be joining a different cluster.

Logically there is no difference between a primary and a replica shard,
the only physical difference is a flag that tells the cluster state which
is which. This is why ES will never assign a primary and its corresponding
replica to the same node.

You cannot get around the root of your problem unless you add another
node and set discovery.zen.minimum_master_nodes to ensure a majority quorum.
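
The quorum arithmetic behind that advice can be sketched as follows. This is only an illustration, assuming the usual majority rule of (master-eligible nodes / 2) + 1; the function names are hypothetical and this is not Elasticsearch's election code.

```python
def majority_quorum(master_eligible_nodes):
    """Smallest number of master-eligible nodes that forms a strict majority."""
    return master_eligible_nodes // 2 + 1


def sides_that_elect_master(partition_sizes, minimum_master_nodes):
    """Return the sizes of the partition sides that may elect a master.

    A side may elect a master only if it still sees at least
    minimum_master_nodes master-eligible nodes (itself included).
    """
    return [size for size in partition_sizes if size >= minimum_master_nodes]


# Two nodes, threshold of 1: after a network split both sides elect a master.
print(sides_that_elect_master([1, 1], minimum_master_nodes=1))
# Two nodes, threshold of 2: no split brain, but neither side can elect a master.
print(sides_that_elect_master([1, 1], majority_quorum(2)))
# Three nodes, threshold of 2: only the side holding two nodes elects a master.
print(sides_that_elect_master([2, 1], majority_quorum(3)))
```

With two nodes the choice is between split brain (threshold 1) and losing the master whenever either node is unreachable (threshold 2), which is why a third master-eligible node is suggested.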

On 4 May 2015 at 15:27, Gourav H Dhelaria gouravdhela...@gmail.com
wrote:

1) After network goes down, they lose communication with each other.
After that, they are becoming split.
2) They both think they are masters. Even if they think they are
masters, shouldn't the ping happen to see if there are other nodes in the
cluster ?
3) Number of replicas is set to 1. If ES doesn't differentiate, why are
some shards primary and others replica ?

On Monday, 4 May 2015 10:48:24 UTC+5:30, Mark Walkom wrote:

1. Why are they becoming split anyway? GC, other load, network?
2. Not if they both think they are masters.
3. Are you running replicas? If so ES doesn't really differentiate
between the two.

On 4 May 2015 at 15:03, Gourav H Dhelaria gouravd...@gmail.com wrote:

Version: 1.4.
Say there are 2 nodes X and Y, both capable of becoming master.
When network goes down, both nodes get disconnected from each other
and assume the responsibility of master.
When network is restored, they don't ping each other and form a
cluster.

Elasticsearch service has to be restarted on any one of the nodes for
them to form a cluster. Even after they form a cluster, all primary shards
remain on one node ( on which the service was restarted ), and all replica
shards are on the other node.

This document

http://www.elastic.co/guide/en/elasticsearch/guide/current/_important_configuration_changes.html

mentions that there has to be an uneven number of master eligible
nodes.

Queries:

1) Is there a way of avoiding split brain problem in 2 node cluster ?

2) After network is restored, shouldn't the nodes ping each other and
form a cluster ?

3) After the service is restarted to form the cluster, why don't the
primary shards get distributed on both the nodes ?

Thanks,

Gourav


Re: Field-length norm fails on fields with 3 and 4 words

2015-04-30 Thread Ivan Brusic

The field norm is computed at index time and is stored in a single byte,
which can lead to a loss in precision. This behavior might have changed
with newer versions of Lucene, but probably not.
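
To illustrate the precision loss, here is a rough Python re-implementation of the one-byte encoding, written from memory of Lucene 4.x's SmallFloat.floatToByte315 and of the 1/sqrt(numTerms) length norm used by the default similarity. Treat both as assumptions rather than as the exact code running in your cluster.

```python
import math
import struct


def float_to_byte315(value):
    """Squeeze a float into one byte (assumed port of Lucene 4.x SmallFloat)."""
    bits = struct.unpack(">i", struct.pack(">f", value))[0]
    small = bits >> 21          # keeps the exponent and the top mantissa bits
    zero = (63 - 15) << 3
    if small <= zero:
        return 0 if bits <= 0 else 1   # underflow
    if small >= zero + 0x100:
        return 255                     # overflow
    return small - zero


def byte315_to_float(encoded):
    """Expand the stored byte back into a float."""
    if encoded == 0:
        return 0.0
    bits = ((encoded & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_length_norm(num_terms):
    """Length norm as it would be read back after the one-byte round trip."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


for terms in (1, 2, 3, 4, 5):
    print(terms, 1.0 / math.sqrt(terms), stored_length_norm(terms))
```

Under these assumptions the exact norms for 3 and 4 terms (about 0.577 and 0.5) both decode to 0.5, so the two example fields end up with the same field-length norm.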

Ivan
On Apr 30, 2015 6:42 PM, Fil ES lisowski.fili...@gmail.com wrote:

Hello,

I am experiencing a very annoying behaviour of the Elasticsearch score
calculating algorithm - the field-length norm fails to find a difference between
fields which contain 3 and 4 words. It always returns the same score for both.
Example:

LANCA HOTEL EXTREME and MASSIVE AMAZING HOTEL GROUP

would come back with the same field length and set the same score for
field-length norm.

I did try using BM25 similarity instead of default one manipulating
parameters, however the output would be always the same.

Anybody got any idea why that would be happening? It is extremely annoying
as most of fields in each document contain about 3-4 words.

Thank you,
Fil


Re: How to replicate this type of search

2015-04-30 Thread Ivan Brusic

Although the syntax is straight Lucene (query string query), I suspect that
Github and other sites parse the query term to create a format similar to
the one John mentioned.
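
A minimal sketch of that kind of pre-parsing, assuming key:value tokens become term filters and the remaining words become a match query wrapped in an Elasticsearch 1.x style filtered query. The function name, the "title" default field and the indexed field "is" are all hypothetical.

```python
def parse_search_box(text, default_field="title"):
    """Split 'key:value' tokens into term filters; the rest is free text."""
    filters = []
    words = []
    for token in text.split():
        key, sep, value = token.partition(":")
        if sep and key and value:
            filters.append({"term": {key: value}})
        else:
            words.append(token)

    if words:
        query = {"match": {default_field: " ".join(words)}}
    else:
        query = {"match_all": {}}

    if not filters:
        return {"query": query}
    # 'filtered' is the 1.x-era way to combine a query with filters.
    return {
        "query": {
            "filtered": {
                "query": query,
                "filter": {"bool": {"must": filters}},
            }
        }
    }


print(parse_search_box("is:open is:issue split brain"))
```

The application, not Elasticsearch, decides which prefixes are recognised, so unknown keys can be rejected or treated as plain text before the request is built.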

Cheers,

Ivan
On May 1, 2015 1:22 AM, Peter Sorensen peter.jens.soren...@gmail.com
wrote:

Sorry for the vague title. If I knew what to call what I was looking for,
I'd have a much easier time finding it!

Anyways, I often see sites using filters right in the query box. For
instance, on Github, you can see open issues by typing ` is:open is:issue
{search term} `

What element of ES is used to perform this?


Re: Apply word_delimiter token filter on words having 5 chars or more.

2015-04-24 Thread Ivan Brusic

Your best option would be to write your own filter. It should be easy since
you have access to the source of the delimiter and length filters. Look at
the existing filter plugins for examples on how to deploy.
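
The logic such a custom filter would need can be sketched in Python. This only illustrates the length check plus a simplified split rule; it is not a Lucene TokenFilter, and the function name and regular expression are assumptions.

```python
import re

# Simplified stand-in for word_delimiter: split on case changes,
# letter/digit boundaries and non-alphanumeric characters.
_PARTS = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+")


def conditional_word_delimiter(tokens, min_length=5):
    """Split tokens of at least min_length characters; pass shorter ones through."""
    for token in tokens:
        if len(token) < min_length:
            yield token
            continue
        parts = _PARTS.findall(token)
        # Keep the original token if the split rule finds nothing to emit.
        yield from parts or [token]


print(list(conditional_word_delimiter(["a-b", "Wi-Fi", "PowerShot", "SD500x"])))
```

A real plugin would apply the same length test inside incrementToken() and delegate to the existing delimiter logic only for the longer tokens.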

Ivan
On Apr 24, 2015 10:39 AM, Nassim nassim.ka...@gmail.com wrote:

 Hi,

 Is it possible to apply the word_delimiter token filter only on words
 having 5 chars or more ?

 Thank you.


Re: I have got a little Problem with my synonym filter ....

2015-04-21 Thread Ivan Brusic

What kind of query are you executing? Are you querying against a specific
field? A match query against the title field should work.

When using the analyze API, explicitly state the field and not the analyzer
for a more accurate picture of what really goes on.
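
As a rough sketch of both suggestions, the helpers below build the request path for a field-based analyze call and the body of a match query on the title field. The helper names are hypothetical, and the field parameter is assumed to behave as in the 1.x analyze API.

```python
import json
from urllib.parse import urlencode


def analyze_by_field_path(index, field):
    """Path for an analyze request that uses the analyzer mapped to a field."""
    return "/{}/_analyze?{}".format(index, urlencode({"field": field}))


def match_query(field, text):
    """Search body for a match query against one specific field."""
    return {"query": {"match": {field: text}}}


# Analyze the sample text the way the title field is mapped.
print("POST", analyze_by_field_path("testindex", "title"))
print("aaa test daten eintrag")

# Search the title field explicitly rather than relying on a default field.
print("POST /testindex/_search")
print(json.dumps(match_query("title", "bbb"), indent=2))
```

Querying the field by name matters here because the synonym filter is attached only to the title mapping; a search that targets some other field would not see the expanded tokens.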

Cheers,

Ivan
On Apr 21, 2015 11:40 AM, Ste Phan stephan.sk...@gmail.com wrote:

 *... I build a little sample of what I do.*

 *My Test Synonyms file is (test.syn placed into my /etc/elasticsearch
 folder):*

 aaa,bbb,ccc,ddd
 www,xxx,yyy,zzz
 eee,fff,ggg,hhh => 111
 sss,ttt,uuu,vvv => 222
 rrr => 333,444,555

 *I created an index like so:*

 PUT /testindex?pretty
 {
   "settings": {
     "analysis": {
       "analyzer": {
         "myIndexAnalyzer": {
           "tokenizer": "standard",
           "filter": [
             "lowercase",
             "mySynonymsFilter"
           ]
         },
         "mySearchAnalyzer": {
           "tokenizer": "standard",
           "filter": [
             "lowercase"
           ]
         }
       },
       "filter": {
         "mySynonymsFilter": {
           "type": "synonym",
           "ignore_case": true,
           "synonyms_path": "test.syn"
         }
       }
     }
   },
   "mappings": {
     "testitem": {
       "properties": {
         "title": {
           "type": "string",
           "index_analyzer": "myIndexAnalyzer",
           "search_analyzer": "mySearchAnalyzer"
         }
       }
     }
   }
 }

 *and added some data:*

 POST /_bulk
 { "index": { "_index": "testindex", "_type": "testitem", "_id": 1 }}
 { "title": "aaa test daten eintrag." }
 { "index": { "_index": "testindex", "_type": "testitem", "_id": 2 }}
 { "title": "bbb test daten eintrag." }
 { "index": { "_index": "testindex", "_type": "testitem", "_id": 3 }}
 { "title": "eee test daten eintrag." }

 *Testing the myIndexAnalyzer using*

 POST /testindex/_analyze?analyzer=myIndexAnalyzer&pretty
 {aaa test daten eintrag}

 *Results to:*

 {
   "tokens": [
     {
       "token": "aaa",
       "start_offset": 1,
       "end_offset": 4,
       "type": "SYNONYM",
       "position": 1
     },
     {
       "token": "bbb",
       "start_offset": 1,
       "end_offset": 4,
       "type": "SYNONYM",
       "position": 1
     },
     {
       "token": "ccc",
       "start_offset": 1,
       "end_offset": 4,
       "type": "SYNONYM",
       "position": 1
     },
     {
       "token": "ddd",
       "start_offset": 1,
       "end_offset": 4,
       "type": "SYNONYM",
       "position": 1
     },
     {
       "token": "test",
       "start_offset": 5,
       "end_offset": 9,
       "type": "ALPHANUM",
       "position": 2
     }
   ]
 }

 *Which to me seems to be fine.*

 Searching this index, I expected to find record IDs 1 and 2 if I am
 searching for aaa, bbb, ccc, ddd.

 What is my mistake?

 TIA
 Ste Phan



Re: Evaluating Moving to Discourse - Feedback Wanted

2015-04-20 Thread Ivan Brusic

I believe the best developers are cynics.  Never trust someone else's code,
that API, the OS, etc :)

What bothers me about Discourse is that email is an afterthought. They have
not built out that feature yet? For me, and apparently many others, email
is the first concern.

The transition is understandable if you want to transition from a closed
system. The other reason, not enough fragmentation, is worrisome. If no one
uses the logstash list, that is a problem with the site/documentation, not
the mailing list itself. I cringe at the thought of an Elasticsearch forum
with a dozen subforums.

Ivan
On Apr 15, 2015 7:21 PM, Leslie Hawthorn leslie.hawth...@elastic.co
wrote:



 On Wed, Apr 15, 2015 at 9:02 AM, Ivan Brusic i...@brusic.com wrote:

 I should clarify that I have no issues moving to Discourse, as long as
 instantaneous email interaction is preserved, just wanted to point out that
 I see no issues with the mailing lists.

 Understood.

 The question is moot anyways since the change will happen regardless of
 our inputs.

 Actually, I'm maintaining our Forums pre-launch checklist where there's a
 line item for "don't move forward based on community feedback." I
 respectfully disagree with your assessment that the change will happen
 regardless of input from the community. We asked for feedback for a reason.
 :)

 I hope we can subscribe to Discourse mailing lists without needing an
 account.

 You'll need an account, but it's a one-time login to set up your
 preferences and then read/interact solely via email.

 Cheers,
 LH

 Cheers,

 Ivan

Re: Aggregation not limited to filter?

2015-04-15 Thread Ivan Brusic

Which version are you using? The old post filter methods, simply named
filter, should have been removed, or at least deprecated.

Cheers,

Ivan
On Apr 13, 2015 1:33 PM, James Green james.mk.gr...@gmail.com wrote:

Indeed. I had used postFilter to add my filters. The documentation for
filters doesn't show how to use a query with a matchAll and a bunch of
filters so I blindly followed IDE auto-complete.

Lesson learned.

On 10 April 2015 at 21:17, James Macdonald james.macdon...@geofeedia.com
wrote:

I had a similar problem recently and solved it by moving my filter into a
filtered query (leaving the query as a match_all), see documentation here
http://www.elastic.co/guide/en/elasticsearch/reference/1.5/query-dsl-filtered-query.html
.

I am not certain why filters do not restrict the scope of the aggregates,
but queries do, but I suspect it interprets the filter (not wrapped in a
filtered_query) as a post_filter (
http://www.elastic.co/guide/en/elasticsearch/reference/1.x/search-request-post-filter.html).
Maybe someone else actually knows why.
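
The difference described above can be sketched as two request bodies, built here as plain Python dictionaries in the shape the 1.x reference pages linked in this message describe. The field names `status` and `author` are hypothetical, and this is a sketch for comparison, not the poster's actual request.

```python
def post_filter_search(filter_clause, aggs):
    """Filter applied after aggregation: narrows hits, not aggregation counts."""
    return {
        "query": {"match_all": {}},
        "post_filter": filter_clause,
        "aggs": aggs,
    }


def filtered_query_search(filter_clause, aggs):
    """Filter wrapped in a filtered query: narrows hits and aggregations."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": filter_clause,
            }
        },
        "aggs": aggs,
    }


# Hypothetical field names, for illustration only.
status_filter = {"term": {"status": "published"}}
by_author = {"authors": {"terms": {"field": "author"}}}

narrow_hits_only = post_filter_search(status_filter, by_author)
narrow_everything = filtered_query_search(status_filter, by_author)
```

Aggregations are computed over the documents matched by the query part of the request, so only the second body restricts the bucket counts.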

Hope that helps,
James

On Fri, Apr 10, 2015 at 11:39 AM, James Green james.mk.gr...@gmail.com
wrote:

I must be doing something stupid!

Using the Java client I can perform a search with a filter and iterate
over the hits. I see exactly the right source documents.

If I add an aggregation, I see the expected keyAsText string but the
docCount reflects the volume if the filter had not been applied.

I expected the aggregation to be restricted to the results within that
filter?

Thanks,

James


Re: Evaluating Moving to Discourse - Feedback Wanted

2015-04-15 Thread Ivan Brusic

I should clarify that I have no issues moving to Discourse, as long as
instantaneous email interaction is preserved, just wanted to point out that
I see no issues with the mailing lists.

The question is moot anyways since the change will happen regardless of our
inputs.

I hope we can subscribe to Discourse mailing lists without needing an
account.

Cheers,

Ivan
On Apr 13, 2015 7:13 PM, Leslie Hawthorn leslie.hawth...@elastic.co
wrote:

 Thanks for your feedback, Ivan.

 There's no plan to remove threads from the forums, so information would
 always be archived there as well.

 Does that impact your thoughts on moving to Discourse?

 Folks, please keep the feedback coming!

 Cheers,
 LH

 On Sat, Apr 11, 2015 at 12:09 AM, Ivan Brusic i...@brusic.com wrote:

 As one of the oldest and most frequent users (before my sabbatical) of
 the mailing list, I just wanted to say that I never had an issue with it.
 It works. As long as I could continue using only email, I am happy.

 For realtime communication, there is the IRC channel. I prefer the
 mailing list since everything is archived.

 Ivan


Re: BM25 for query itself

2015-04-15 Thread Ivan Brusic

Isn't the point of BM25 to use variable document length normalization? It
works when used on the entire index/corpus. It is meant to influence the TF
values.
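
To make the length normalization concrete, here is a small sketch of the commonly cited BM25 per-term formula. The defaults k1=1.2 and b=0.75 are the usual ones, and all the numbers passed in are made up for illustration.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, doc_count, doc_freq,
                    k1=1.2, b=0.75):
    """Score of one term in one document under the usual BM25 formula."""
    # Inverse document frequency, computed from corpus-wide statistics.
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: documents longer than average are penalized.
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1)) / (tf + k1 * length_norm)


# Same term frequency, different document lengths (made-up numbers).
short_doc = bm25_term_score(tf=2, doc_len=50, avg_doc_len=100,
                            doc_count=1000, doc_freq=10)
long_doc = bm25_term_score(tf=2, doc_len=400, avg_doc_len=100,
                           doc_count=1000, doc_freq=10)
```

Note that `avg_doc_len`, `doc_count`, and `doc_freq` all come from the whole index, which is why the formula has no obvious inputs when a query is scored against another query in isolation.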

Comparing results between Lucene queries is not advisable. Why did you
switch to BM25? Do your field lengths vary much?

Cheers,

Ivan
On Apr 15, 2015 3:14 AM, bohdan bkle...@gmail.com wrote:

Hi,

I'm wondering whether there is a way to calculate a BM25 score for the
query itself (query-against-query)?
Adding it to the index seems to be an invalid solution, as it would
influence the tf-idf statistics of the index and corrupt them.

Thank you,
Bohdan


Re: Querystring search: Tokens are out of order

2015-04-15 Thread Ivan Brusic

Your understanding is correct. The former will be translated into a Lucene
phrase query, which uses the term doc positions to find matches.

Both query terms are analyzed, but the latter will simply be a bag-of-words
query, which ignores positions.
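
The distinction can be illustrated with a toy matcher in Python. This is not how Lucene implements either query. The analyzer is a simplified stand-in, and requiring all terms in the bag-of-words case is an assumption, since that depends on the default operator.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on any non-alphanumeric character."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def phrase_match(doc_text, query_text):
    """True if the query tokens occur in the document at consecutive positions."""
    doc, query = analyze(doc_text), analyze(query_text)
    if not query:
        return False
    width = len(query)
    return any(doc[i:i + width] == query
               for i in range(len(doc) - width + 1))


def bag_of_words_match(doc_text, query_text):
    """True if every query token occurs somewhere in the document, in any order."""
    doc = set(analyze(doc_text))
    query = analyze(query_text)
    return bool(query) and all(token in doc for token in query)
```

With a document such as "bar then foo", the phrase matcher rejects the query "foo bar" while the bag-of-words matcher accepts "foo-bar", which mirrors the out-of-order hits discussed in this thread.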

Cheers,

Ivan
On Apr 14, 2015 10:38 PM, Dave Reed infinit...@gmail.com wrote:

To perhaps answer my own question, I think I understand the difference.

details:"foo bar"

Would search for the tokens in the same order (implied by the docs I
referenced). But

details:foo-bar

Would not honor the order. The quotes have more meaning than to enclose
the phrase... if that is true then these two queries are not the same,
which is different than I thought:

details:foo\ bar
!=
details:"foo bar"

Or am I barking up the wrong tree...

On Tuesday, April 14, 2015 at 1:34:28 PM UTC-7, Dave Reed wrote:

Thanks, though unless I am misunderstanding it, the docs imply otherwise:

For example, from:
http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html

The query string is parsed into a series of *terms* and *operators*. A
term can be a single word - quick or brown - or a phrase, surrounded by
double quotes - "quick brown" - which searches for all the words in the
phrase, in the same order.

So what gives? :)


Re: Elasticsearch Upgrade to Version 1.4.4

2015-04-12 Thread Ivan Brusic

In my experience, the client can be older than the server.* The server side
code contains many version checks, so it should know how to handle requests
from older clients. The inverse is much harder to support since clients do
not change their requests based on the server.

* Between minor versions. It should work between major versions as well,
but I would not chance it.

Ivan
On Apr 12, 2015 3:15 PM, Costya Regev cos...@totango.com wrote:

Hi,

We are running Elasticsearch version 1.4.2; our client version is
1.4.1.

We want to upgrade to ES version 1.4.4. Do we need to upgrade the client
version as well in order to achieve the upgrade?

Thanks,

Costya.


Re: Evaluating Moving to Discourse - Feedback Wanted

2015-04-10 Thread Ivan Brusic

As one of the oldest and most frequent users (before my sabbatical) of the
mailing list, I just wanted to say that I never had an issue with it. It
works. As long as I could continue using only email, I am happy.

For realtime communication, there is the IRC channel. I prefer the mailing
list since everything is archived.

Ivan
On Apr 2, 2015 5:36 PM, leslie.hawthorn leslie.hawth...@elastic.co
wrote:

Hello everyone,

As we’ve begun to scale up development on three different open source
projects, we’ve found Google Groups to be a difficult solution for dealing
with all of our needs for community support. We’ve got multiple mailing
lists going, which can be confusing for new folks trying to figure out
where to go to ask a question.

We’ve also found our lists are becoming noisy in the “good problem to
have” kind of way. As we’ve seen more user adoption, and across such a wide
variety of use cases, we’re getting widely different types of questions
asked. For example, I can imagine that folks not using our Python client
would rather not be distracted with emails about it.

There’s also a few other strikes against Groups as a tool, such as the
fact that it is no longer a supported product by Google, it provides no API
hooks and it is not available for users in China.

We’ve evaluated several options and we’re currently considering shuttering
the elasticsearch-user and logstash-users Google Groups in favor of a
Discourse forum. You can read more about Discourse at
http://www.discourse.org

We feel Discourse will allow us to provide a better experience for all of
our users for a few reasons:

* More fine grained conversation topics = less noise and better targeted
discussions. e.g. we can offer a forum for each language client, individual
logstash plugin or for each city to plan user group meetings, etc.

* Facilitates discussions that are not generally happening on list now,
such as best practices by use case or tips for moving from development to
production

* Easier for folks who are purely end users - and less used to getting
peer support on a mailing list - to get help when they need it

Obviously, Discourse does not function the exact same way as a mailing
list - however, email interaction with Discourse is supported and will
continue to allow you to participate in discussions over email (though
there are some small issues related to in-line replies. [0])

We’re working with the Discourse team now as part of evaluating this
transition, and we know they’re working to resolve this particular issue.
We’re also still determining how Discourse will handle our needs for both
user and list archive migration, and we’ll know the precise details of how
that would work soon. (We’ll share when we have them.)

The final goal would be to move Google Groups to read-only archives, and
cut over to Discourse completely for community support discussions.

We’re looking at making the cut over in ~30 days from today, but obviously
that’s subject to the feedback we receive from all of you. We’re sharing
this information to set expectations about time frame for making the
switch. It’s not set in stone. Our highest priority is to ensure effective
migration of our list archives and subscribers, which may mean a longer
time horizon for deploying Discourse, as well.

In the meantime, though, we wanted to communicate early and often and get
your feedback. Would this change make your life better? Worse? Meh?

Please share your thoughts with us so we can evaluate your feedback. We
don’t take this switch lightly, and we want to understand how it will
impact your overall workflow and experience.

We’ll make regular updates to the list responding to incoming feedback and
be completely transparent about how our thought processes evolve based on
it.

Thanks in advance!

[0] - https://meta.discourse.org/t/migrating-from-google-groups/24695

Cheers,
LH

Leslie Hawthorn
Director of Developer Relations
http://elastic.co

Other Places to Find Me:
Freenode: lh
Twitter: @lhawthorn


Re: elastic.co blog RSS URL missing

2015-03-26 Thread Ivan Brusic

I noticed the same thing. The link is redirecting for me, but my reader
(AOL Reader) appears not to handle redirects.

Ivan
On Mar 25, 2015 9:11 AM, Magnus Bäck magnus.b...@sonymobile.com wrote:

The not too widely announced move from elasticsearch.(com|org) to
elastic.co the other week seems to have broken the old Elasticsearch
blog RSS feed, and I can’t find the RSS URL for the replacement
elastic.co blog. Please say there is one.

A final post to the old blog referring to the new one would've been
nice. I'm probably not the only one who's missed the updates from
the last week or two.

--
Magnus Bäck| Software Engineer, Development Tools
magnus.b...@sonymobile.com | Sony Mobile Communications


Re: Manually adjusting document sort based on queries

2015-03-22 Thread Ivan Brusic

Easiest option, in terms of complexity, would probably be to use a bool
query, where product x and y are matched by an id query with high boosts.

Best option is probably the function score query:
http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html

Create a filter for each product and replace the score with something
arbitrarily high.

You could get detailed by using a custom script, but it will not be as
efficient.
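
A minimal sketch of the function score approach, built as a Python dictionary. It makes several assumptions: the helper name and base weight are made up, `weight` is used (older 1.x releases used `boost_factor` instead), and `boost_mode` is set to `sum` rather than a literal `replace`. The intent of `sum` is that unpinned documents keep their relative order, on the assumption that a document matching no function gets the same neutral function value as every other unpinned document.

```python
def pinned_results_query(user_query, pinned_ids, base_weight=1000000):
    """Build a function_score body that lifts pinned_ids to the top, in order."""
    functions = []
    for rank, doc_id in enumerate(pinned_ids):
        functions.append({
            # One filter per pinned product, matched by document id.
            "filter": {"ids": {"values": [doc_id]}},
            # Earlier entries get larger weights so they sort first.
            "weight": base_weight * (len(pinned_ids) - rank),
        })
    return {
        "query": {
            "function_score": {
                "query": user_query,
                "functions": functions,
                "score_mode": "max",
                "boost_mode": "sum",
            }
        }
    }


# Hypothetical query and product ids, for illustration only.
body = pinned_results_query(
    {"match": {"title": "green military shirts"}},
    ["X", "J"],
)
```

The base weight only needs to be far larger than any score the text query can produce. The list of pinned ids per search phrase would live in the application, not in Elasticsearch.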

Cheers,

Ivan
On Mar 21, 2015 7:39 PM, Zelfapp n...@usamm.com wrote:

For specific queries I want to be able to manually set the sort order of
some search results. Without going into the reason why, how do I implement
this? I'm not talking about boosting, I'm talking about manually setting
the sort position based on a specific query.

E.g. if a user searches for "Green Military Shirts" I want to set product X
to position 1 and product J to position 2. All other search results would
be sorted as usual.


Re: Is it possible to write my own filter ?

2015-03-12 Thread Ivan Brusic

Off the top of my head, I cannot think of an existing filter that
accomplishes that task.

Creating a custom filter is easy. Simply create a Lucene token filter and
build a plug-in around it. Take a look at existing analysis plug-ins for
inspiration.

http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-plugins.html#analysis-plugins
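
The splitting logic such a filter would need can be sketched independently of Lucene. The Python below only illustrates the token handling. A real plug-in would wrap equivalent logic in a Java token filter. The function names and the suffix file format (one suffix per line, `#` for comments) are assumptions.

```python
def load_suffixes(lines):
    """Read one suffix per line, skipping blank lines and '#' comments."""
    suffixes = set()
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            suffixes.add(entry.lower())
    return suffixes


def split_suffix(token, suffixes):
    """Return [stem, suffix] for the longest matching suffix, else [token]."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        # Require a non-empty stem so a bare suffix is left untouched.
        if suffix and len(token) > len(suffix) and token.endswith(suffix):
            return [token[:-len(suffix)], suffix]
    return [token]


# Made-up suffix list, standing in for the contents of a text file.
SUFFIXES = load_suffixes(["strasse\n", "weg\n", "# comment\n", "\n"])
```

Trying the longest suffix first avoids splitting on a short suffix that happens to be the tail of a longer one.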

Cheers,

Ivan
On Mar 12, 2015 11:43 AM, cornet.r...@gmail.com wrote:

Hi everyone,

I need a filter that splits a word in two when it ends with a suffix that
belongs to a list (maybe a text file containing all the suffixes), but I
can't find an existing filter doing that.

Does anyone have a solution to this?
If not, is there a way to write my own filter in Java and add it to
Elasticsearch? :)


Re: re-use zen discovery API

2015-03-11 Thread Ivan Brusic

The discovery API is not modularized enough to use it outside of the
Elasticsearch context. I would simply use something like Zookeeper, which
is built exactly for situations like yours.

Cheers,

Ivan
On Mar 7, 2015 7:03 PM, Pierre de Soyres pdesoy...@gmail.com wrote:

Hello,

I would like to know if I can use the zen-discovery API of elasticsearch
for a totally different project (also using an elasticsearch cluster) that
also needs to scale horizontally. I do not want to reinvent the wheel, and
since I'm using the elasticsearch library, I would like to take advantage
of the elasticsearch API for node detection and master election.

Thanks in advance

*Pierre*


Re: Ignore a field in the scoring

2014-12-26 Thread Ivan Brusic

Use the field in a filter rather than as part of the query. Is this field
free text?
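
A minimal sketch of that suggestion, assuming the 1.x `filtered` query and two hypothetical field names (`body` for the scored text, `bookmarked_by` for the metadata). Filter clauses only include or exclude documents, so the metadata clause here does not contribute to the score:

```python
import json

def filtered_search(text, user):
    """Build a 1.x-style search body.

    The match clause on `body` is scored; the term clause on the
    (hypothetical) metadata field `bookmarked_by` is a filter and is not.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match": {"body": text}},
                "filter": {"term": {"bookmarked_by": user}},
            }
        }
    }

print(json.dumps(filtered_search("elasticsearch scoring", "roger"), indent=2))
```

The body would be sent to the `_search` endpoint of the index; the field names and values are placeholders, not taken from the original question.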

Ivan
On Dec 23, 2014 9:12 PM, Roger de Cordova Farias
roger.far...@fontec.inf.br wrote:

Hello,

Our documents have metadata indexed with them, but we don't want the
metadata to interfere with the scoring.

After a user searches for documents, they can bookmark them (which means we
add more metadata to the document). In the next search with the same
query, the bookmarked document then appears in a lower (worse) position.

Is there a way to completely ignore one or more specific fields in the
scoring of every query, for example at index time?

Note that we are not using the metadata field in the query, yet it still
lowers the score of every query.

We cannot set the index attribute of this field to "no" because we are
going to use it in other queries.


Re: Why ES node starts recovering all the data from other nodes after reboot?

2014-11-24 Thread Ivan Brusic

It used to be 2 concurrent streams. Has the default been upped in recent
versions? I agree, that number is awfully low. If you can disable indexing
during rolling restarts, those numbers can be much higher.
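
As a rough illustration of the settings discussed in this thread, here is a sketch that builds a body for the cluster update settings API (`PUT /_cluster/settings`). The setting names are the ones mentioned in the thread plus `indices.recovery.max_bytes_per_sec`; the values are placeholders I picked for the example, not recommendations:

```python
import json

def recovery_settings(concurrent_streams, max_bytes_per_sec, node_concurrent_recoveries):
    """Body for PUT /_cluster/settings.

    Transient settings do not survive a full cluster restart, which suits
    a temporary bump during a rolling restart.
    """
    return {
        "transient": {
            "indices.recovery.concurrent_streams": concurrent_streams,
            "indices.recovery.max_bytes_per_sec": max_bytes_per_sec,
            "cluster.routing.allocation.node_concurrent_recoveries": node_concurrent_recoveries,
        }
    }

# Illustrative values only.
print(json.dumps(recovery_settings(6, "100mb", 4), indent=2))
```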

--
Ivan

On Sun, Nov 23, 2014 at 5:48 PM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

The default indices recovery performance is limited by 3 concurrent
streams and 20MB/sec. This is very slow on my machines. YMMV.

Jörg

On Sun, Nov 23, 2014 at 9:01 PM, Konstantin Erman kon...@gmail.com
wrote:

The advice to increase indices.recovery.concurrent_streams sounds
suspiciously specific to me :-) What made you so confident that it is the
bottleneck for recovery in most cases? And how should
cluster.routing.allocation.node_concurrent_recoveries be set?


Re: Specifying search fields in search request

2014-11-24 Thread Ivan Brusic

It all depends on how many fields there are and how big they are. Retrieving
a few specific fields might be faster in some cases, but in general each field
is another seek in Lucene; values are not retrieved at the same time. If you
are going to get all the fields, just use the _source.
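
To make the two retrieval styles concrete, here is a small sketch of the request bodies, assuming 1.x search syntax. The helper name and the field names are made up for the example:

```python
def search_body(query, wanted=None, stored=False):
    """Build a search body for one of three retrieval styles.

    wanted=None          -> return the whole _source (the default behaviour)
    wanted + stored=True -> ask for individual fields via "fields"
    wanted only          -> filter the _source down to the listed fields
    """
    body = {"query": query}
    if wanted is None:
        return body
    if stored:
        body["fields"] = list(wanted)
    else:
        body["_source"] = list(wanted)
    return body

match_all = {"match_all": {}}
print(search_body(match_all))
print(search_body(match_all, ["title", "author"], stored=True))
print(search_body(match_all, ["title", "author"]))
```

Which variant is faster depends on the document size and the number of fields, as noted above, so it is worth measuring on real data.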

-- 
Ivan

On Sun, Nov 23, 2014 at 10:15 AM, Ajay Divakaran ajay.divakara...@gmail.com
 wrote:

 Jorg,
 Thanks for your reply.
 Just to clarify: the 'fields' I was referring to are for retrieval, not
 the fields to search against.


Re: terms filter with the value to match in upercase is not possible?

2014-11-23 Thread Ivan Brusic

A term query does not analyze the search terms, so if your countries field
uses the default analyzer, there will be no match, since the standard
analyzer lowercases the terms at index time. Either set your field as
not_analyzed or use another query such as match.
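
The mismatch can be reproduced with a toy model in plain Python. This is not Elasticsearch code: `standard_like_analyze` is a rough stand-in for the standard analyzer (split and lowercase), and the two lookup functions only mimic how term and match queries treat their input:

```python
import re

def standard_like_analyze(text):
    """Rough stand-in for the standard analyzer: tokenize and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

def index_values(values):
    """Collect the terms that would end up in the index for an analyzed field."""
    terms = set()
    for value in values:
        terms.update(standard_like_analyze(value))
    return terms

def term_lookup(indexed_terms, term):
    """term/terms: the search term is compared as-is, with no analysis."""
    return term in indexed_terms

def match_lookup(indexed_terms, text):
    """match: the search text goes through the same analysis first."""
    return any(token in indexed_terms for token in standard_like_analyze(text))

indexed = index_values(["US", "ES"])   # stored in the index as "us", "es"
print(term_lookup(indexed, "US"))      # uppercase term is not in the index
print(term_lookup(indexed, "us"))      # lowercase term is
print(match_lookup(indexed, "US"))     # match lowercases before looking up
```

With a not_analyzed field the index would hold "US" and "ES" verbatim, and the uppercase term lookup would succeed instead.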

--
Ivan

On Sat, Nov 22, 2014 at 4:35 PM, dashaus emili@gmail.com wrote:

Hi!
I have a document like this:

{
  "type": "film",
  "countries": ["US", "ES"]
}

I insert it into Elasticsearch and then run the following search:

GET _search?search_type=dfs_query_and_fetch
{
  "query": {
    "filtered": {
      "query": {
        "term": {"type": "film"}
      },
      "filter": {
        "query": {
          "terms": {
            "countries": [
              "US",
              "ES"
            ]
          }
        }
      }
    }
  }
}

But I get a result without the document. If I change the terms filter
to

"terms": {
  "countries": [
    "us",
    "es"
  ]
}

then I get a result with the document! So, is this a bug or the desired
behavior?

Thanks!

Cheers,
Emilio

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/terms-filter-with-the-value-to-match-in-upercase-is-not-possible-tp4066613.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.


Re: 1.4.0 data node can't join existing 1.3.4 cluster

2014-11-22 Thread Ivan Brusic

Great work everyone. Feel better about upgrading now.
On Nov 22, 2014 4:42 PM, Boaz Leskes b.les...@gmail.com wrote:

Hi Christian, Daniel,

I believe I found the issue - it has to do with the cloud plugins (both
AWS and GCE) and the way they create the node list for the unicast based
discovery. Effectively they mislead it into thinking that all nodes in the
cluster are version 1.4.0, which is not correct.

I opened issues for this so it will be corrected soon:
https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/143 ,
https://github.com/elasticsearch/elasticsearch-cloud-gce/issues/41

Cheers,
Boaz

On Saturday, November 22, 2014 7:04:33 PM UTC+2, Jörg Prante wrote:

As said, the change is due to unicast action, which was split in 1.4.0 to
an old and a new action, see this commit:

https://github.com/elasticsearch/elasticsearch/commit/e5de47d928582694c7729d199390086983779e6e

I am not sure if this is a bug. It seems like a feature to prevent
multiple masters by accident.

The strategy described above by Christian Hedegaard should work, but it is
still to be considered a workaround:

- setting up all new 1.4 nodes as not master eligible (data only)

- joining them to a 1.3.x cluster while master still is on a 1.3 node
should work

- then, shutting down all 1.3 nodes (except the master) should relocate
the shards

- bringing down the final 1.3 master should stall master election (I
would also configure a large timeout for master election). This is
critical, no index/mapping creations/deletions or cluster state modifying
actions should be executed now.

- adding a 1.4 master eligible node should now take over the cluster (I
would start it with the data folder from the final 1.3 master where the
last cluster state is persisted) and the critical phase is over.

- from then on, it should be possible to add more 1.4 master eligible nodes

- finally, the minimum master nodes setting should be configured
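
The node settings implied by the steps above could look roughly like the sketch below. It only builds dictionaries of `elasticsearch.yml` keys; the host address is a placeholder, and the quorum formula (half the master eligible nodes plus one) is the usual rule of thumb rather than something stated in this thread:

```python
def data_only_node(master_host):
    """Settings for a new 1.4 node that must not become master."""
    return {
        "node.master": False,
        "node.data": True,
        # Point unicast discovery explicitly at the current 1.3 master.
        "discovery.zen.ping.unicast.hosts": [master_host],
    }

def minimum_master_nodes(master_eligible_count):
    """Quorum of master eligible nodes: floor(n / 2) + 1."""
    return master_eligible_count // 2 + 1

def master_eligible_node(master_eligible_count):
    """Settings for a 1.4 node that is allowed to become master."""
    return {
        "node.master": True,
        "discovery.zen.minimum_master_nodes": minimum_master_nodes(master_eligible_count),
    }

# Placeholder address; replace with the real master.
for key, value in data_only_node("10.0.0.1:9300").items():
    print(f"{key}: {value}")
for key, value in master_eligible_node(3).items():
    print(f"{key}: {value}")
```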

Jörg

On Fri, Nov 21, 2014 at 1:56 AM, Christian Hedegaard
chedega...@red5studios.com wrote:

FYI, I have found a solution that works (at least for me).

I’ve got a small cluster for testing, only 4 v1.3.5 nodes. What I’ve
done is bring up 4X new v1.4.0 nodes as data-only machines. In the yaml I
added a line to point the nodes via unicast explicitly to the current
master:

discovery.zen.ping.unicast.hosts: [10.210.9.224:9300]

When I restarted elasticsearch with that setting, with cloud-aws
installed and configured on version 2.4.0, the new nodes found the cluster
and properly joined it.

I will now start nuking the old v1.3.5 nodes to migrate the data off of
them. Before the final 1.3.5 node is nuked, I will change the config on one
of the v1.4.0 nodes to allow it as master and restart it.

I’m not sure if the master stuff is needed or not, but I was very afraid
of a split-brain problem. I have another 4-node testing cluster that I will
be able to try this upgrade again with in a more controlled manner.

I’m NOT looking forward to upgrading our current production cluster this
way (15 data-only nodes, 3 master-only nodes).

So it would appear that the problem is somewhere in the unicast
discovery code. The question is who’s to blame? Elasticsearch or the
cloud-aws plugin?

*From:* Boaz Leskes [mailto:b.les...@gmail.com]
*Sent:* Wednesday, November 19, 2014 2:27 PM
*To:* elasticsearch@googlegroups.com
*Cc:* Christian Hedegaard
*Subject:* Re: 1.4.0 data node can't join existing 1.3.4 cluster

Hi Christian,

I'm not sure what thread you refer to exactly, but this shouldn't
happen. Can you describe the problem you have some more? Anything in the
nodes? (both the 1.4 node and the master)

Cheers,

Boaz

On Wednesday, November 19, 2014 2:39:57 AM UTC+1, Christian Hedegaard
wrote:

I found this thread while trying to research the same issue and it looks
like there is currently no resolution. We like to keep up on our
elasticsearch upgrades as often as possible and do rolling upgrades to keep
our clusters up. When testing I’m having the same issue, I cannot add a
1.4.0 box to the existing 1.3.4 cluster.

Is there a fix for this anticipated?


Re: 1.4.0 data node can't join existing 1.3.4 cluster

2014-11-21 Thread Ivan Brusic

Has an official issue been created? I would like to track the status.

So far, every 1.x.0 release has been buggy. :)

--
Ivan

On Fri, Nov 21, 2014 at 4:06 AM, Mark Walkom markwal...@gmail.com wrote:

It's being looked at, but I don't know much beyond that at the moment
sorry.

On 21 November 2014 20:02, madsmar...@colourbox.com wrote:

Can any of the Elasticsearch team members hint at whether or not this
will be fixed in 1.4.1? Then we'll simply wait for it instead of trying
different hacks to upgrade.

On Monday, November 17, 2014 12:35:03 PM UTC+1, Matthew Barrington wrote:

I stand corrected, this did not work on our main cluster.

On Monday, 17 November 2014 11:13:22 UTC, Matthew Barrington wrote:

We are running a 1.3.4 cluster using the AWS plugin and I noticed the
same error when I tried to upgrade a single node.

Since I was trying this on my test cluster first I decided to see what
would happen if I upgraded a 2nd node. Would it split into 2 clusters, have
the same issue, etc.

What I discovered was that when 2 nodes were upgraded to 1.4 they
joined the cluster correctly and everything looks to be working.

So the problem seems to be with the initial node joining, but when you
try with two everything works out.

On Friday, 14 November 2014 18:05:01 UTC, Eric Jain wrote:

On Fri, Nov 14, 2014 at 3:41 AM, madsm...@colourbox.com wrote:
I'm also seeing this problem when a 1.4.0 node tries joining a 1.3.4
cluster with cloud-aws plugin version 2.4.0. Is there a workaround to use
during the upgrade, since I assume it's not a problem once they're all
upgraded to 1.4.0?

I ended up starting a new cluster (ignoring all the warnings logged on
startup), and restoring from a snapshot. Once all the 1.3.4 nodes were
gone, no issues.

--
Eric Jain
Got data? Get answers at zenobase.com.


Re: Why ES node starts recovering all the data from other nodes after reboot?

2014-11-21 Thread Ivan Brusic

Disabling allocation helps, but it does not solve the problem completely.
Just like Nik, one of my complaints (although not my primary one). :)

I found that recovery gets easier when doing a rolling restart. First few
servers always rebalance, the last ones do not.
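
For reference, the allocation toggle mentioned in this thread can be sketched as two cluster settings bodies, one sent before the node goes down and one after it has rejoined. This assumes the `cluster.routing.allocation.enable` setting from the rolling upgrade documentation; the helper name is my own:

```python
import json

def allocation_toggle(enabled):
    """Body for PUT /_cluster/settings that turns shard allocation on or off."""
    return {
        "transient": {
            "cluster.routing.allocation.enable": "all" if enabled else "none"
        }
    }

# 1. Before shutting the node down:
print(json.dumps(allocation_toggle(False)))
# 2. After the node has rejoined the cluster:
print(json.dumps(allocation_toggle(True)))
```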

--
Ivan

On Thu, Nov 20, 2014 at 9:51 PM, Mark Walkom markwal...@gmail.com wrote:

You should disable allocation before you reboot, that will save a lot of
shard shuffling -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-upgrade.html#rolling-upgrades

On 21 November 2014 13:48, Konstantin Erman kon...@gmail.com wrote:

I work on an experimental cluster of ES nodes running on Windows Server
machines. Once in a while we have a need to reboot machines. The initial
state - cluster is green and well balanced. One machine is gracefully taken
offline and then after necessary service is performed it comes back online.
All the hardware and file system content is intact. As soon as ES service
starts on that machine, it assumes that there is no usable data locally and
recovers as much data as it deems necessary for balancing from other nodes.

This behavior puzzles me, because most of the data shards stored on that
machine file system can be reused as they are. Cluster stores logs, so all
indices except those for the current day never ever change until they get
deleted. Can't ES node detect that it has perfect copies of some (actually
most) of the shards and instead of copying them over just mark them as up
to date?

I suspect I don't know about some step to enable this behavior and I'm
looking to enable it. Any advice?

Thank you!
Konstantin


Re: ES backups without using snapshots?

2014-11-20 Thread Ivan Brusic

I have never used plugins, but there is also Jorg's tool:
https://github.com/jprante/elasticsearch-knapsack

--
Ivan

On Wed, Nov 19, 2014 at 11:27 PM, Mathew D mathew.degerh...@gmail.com
wrote:

Hi Ivan,

Thanks for the quick response. We've got 5 shards per index, so with 2
replicas each node should in theory have a full set of data. I was hoping
that taking the node out of service by stopping it would avoid disruption
as a result of pausing indexing, but I couldn't find any documentation to
confirm if such an operation would leave the data files in a consistent
state that could reliably be used for restore.

Evan's suggestion of elasticdump looks like the closest to what I'm after,
although unfortunately I don't have node.js/npm installed (and being an
enterprise could be tricky to get installed).

NB I hear your concerns re cluster design. Incorporating the remote node
was chosen to minimise data loss following a data centre failure. However,
because of the risk of split brain, the node actually functions more as a
warm DR than any sort of HA...

Regards,
Mat

On Thursday, November 20, 2014 2:32:14 PM UTC+13, Ivan Brusic wrote:

How many shards for each index? I am assuming that each node does not
have all the data.

If you can stop indexing, you can just rsync the data to a local
directory. Make sure you execute a flush and preferably an optimize in
order to merge the segments on disk. The tricky part is the manual combine
you referred to.
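
A sketch of that sequence as a list of commands, assuming indexing has already been stopped. The host, data directory and destination are placeholders, and the script only prints the commands rather than running them:

```python
import shlex

def backup_plan(host, data_dir, dest):
    """Return the commands for a flush, an optimize and an rsync, in order."""
    return [
        # Flush so that everything in the translog is written to segments.
        ["curl", "-XPOST", f"http://{host}/_flush"],
        # Merge segments down so there are fewer files to copy.
        ["curl", "-XPOST", f"http://{host}/_optimize?max_num_segments=1"],
        # Copy the contents of the data directory (trailing slash) to dest.
        ["rsync", "-a", data_dir.rstrip("/") + "/", dest],
    ]

for command in backup_plan("localhost:9200", "/var/lib/elasticsearch", "/backup/es"):
    print(" ".join(shlex.quote(part) for part in command))
```

This covers a single node only; combining per-node copies into one consistent backup is the part that remains manual.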

BTW, 3 nodes/2 data centers? Sounds like a recipe for trouble. :)

Cheers,

Ivan

On Wed, Nov 19, 2014 at 7:41 PM, Mathew D mathew.d...@gmail.com wrote:

Hi there,

Any suggestions as to how I can create full ES backups without using
snapshot functionality?

The reason I can't use snapshots is because they require a shared
directory mounted on all nodes, but my 3-node cluster spans two data
centres and I am not able to NFS mount over the WAN. I'm also not
permitted to backup to AWS/S3.

As I have 2 replicas of each index, I'm leaning towards the idea of
stopping one node and backing up that node's data directory but wondered if
anyone could suggest a more elegant way. For example, could I snapshot to
a local directory on each node, then manually combine the contents into a
single cohesive backup?

Regards,
Mat


Increased query count after moving to nested documents

2014-11-20 Thread Ivan Brusic

We have always indexed nested documents, but never fully used them since
issue 3022 is still outstanding. Finally made the move to actually
filtering documents at the nested level.

Tracking metrics with graphite/grafana, I noticed immediately that the
active/current query count is much higher although the actual volume of
queries has not changed. The overall query count is normal. Is using join
queries increasing the number of queries reported?

Cheers,

Ivan


Re: Does nested query with operator honor the operator or does it always display some default behavior

2014-11-19 Thread Ivan Brusic

As mentioned before, that syntax seems strange to me. I have never seen an
array used with a match query. I wonder what the resulting Lucene query is.
I think that analyzed/non-analyzed just might be a red herring. What does
the explanation output say?
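
One way to get that explanation is to set the `explain` flag on the search body, which adds a score explanation to every hit. A small sketch; the helper name is made up and the query is a placeholder:

```python
import copy

def with_explain(body):
    """Return a copy of a search body with per-hit score explanations enabled."""
    explained = copy.deepcopy(body)
    explained["explain"] = True
    return explained

original = {"query": {"match": {"NESTED_FIELD.v": "AAPL.OQ"}}}
print(with_explain(original))
```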

--
Ivan

On Wed, Nov 19, 2014 at 10:24 AM, Ramdev Wudali agasty...@gmail.com wrote:

The fields I am searching against are analyzed by the default analyzer.
The query, as I noted in my question, was generated by the Java API, so
the array syntax is the API's interpretation. That said, I ran a few more
experiments. If the field is not analyzed (unlike in my original case), the
query works and returns the right documents, meaning documents in which
both values exist. But if the field is analyzed, the operator is not honored.

So now my questions are: why would not_analyzed fields cause the operator
to be honored? And does the operator within a nested query depend on
whether the nested field is actually analyzed?

Ramdev

On Tuesday, 18 November 2014 14:45:53 UTC-6, Ivan Brusic wrote:

I have never seen the array syntax with the match query, so I am not sure
what the behavior should be. Since your search terms are not analyzed in
your example, a terms query with a minimum match of 100% should work. If
not, perhaps creating a single search term of your existing terms?

--
Ivan

On Tue, Nov 18, 2014 at 10:23 AM, Ramdev Wudali agas...@gmail.com
wrote:

Hi :
I have the following query :
{
  "query": {
    "bool": {
      "must": {
        "nested": {
          "query": {
            "bool": {
              "must": [
                {
                  "match": {
                    "NESTED_FIELD.v": {
                      "query": ["AAPL.OQ", "GOOGL.OQ"],
                      "operator": "and"
                    }
                  }
                },
                {
                  "range": {
                    "NESTED_FIELD.s": {
                      "from": 0.6,
                      "to": null,
                      "include_lower": true,
                      "include_upper": true
                    }
                  }
                }
              ]
            }
          },
          "path": "NESTED_FIELD"
        }
      }
    }
  },
  "filter": {
    "bool": {
      "must": [
        {
          "range": {
            "DOC_DATE.v": {
              "from": "2014-08-19T20:00:00.000-04:00",
              "to": "2014-10-18T23:59:59.999Z",
              "include_lower": true,
              "include_upper": true
            }
          }
        }
      ]
    }
  }
}

The behavior I expect is the following:

The documents that are returned should contain both values for
NESTED_FIELD.v (AAPL.OQ and GOOG.OQ), each satisfying the condition that
its corresponding NESTED_FIELD.s range is also satisfied.

The behavior I see:
The documents returned contain either one of the values (that is, a
document has AAPL.OQ, or GOOG.OQ, or both).

I want only documents that have both values. So the "operator": "and"
(and its variant "operator": "AND") does not seem to have any effect.
Any pointers or suggestions regarding this are much appreciated. I am using
the Java API to construct my queries (but I do not think it should matter).

Thanks

Ramdev


Re: Please help me to do a filtered query!!!

2014-11-19 Thread Ivan Brusic

Try using Jorg's plugin:
https://github.com/jprante/elasticsearch-plugin-arrayformat

-- 
Ivan

On Wed, Nov 19, 2014 at 7:15 AM, tch...@360incentives.com wrote:



 On Wednesday, 19 November 2014 13:07:13 UTC+2, tch...@360incentives.com
 wrote:

 Hi everyone! Please advise me on how to return only a specific value from a query.

 Here's what a document in my index looks like:

 {
   "_index": "logstash-2014.10.23",
   "_type": "mia_worker-prod",
   "_id": "wLAyxPVRTeOXsl9mUk3iNw",
   "_score": 1,
   "_source": {
     "name": "mia_worker",
     "hostname": "360prod-mia01",
     "pid": 890,
     "resource": "user_timeline",
     "id": "claim:17762896",
     "version": "2014-10-23T13:34:08.817",
     "level": 30,
     "msg": "changed.complete",
     "time": "2014-10-23T13:35:07.714Z",
     "v": 0
   }
 }

 I need a query that returns just the level: 30 value for my monitoring.
 Thank you in advance for your help!

 Please, please help!


Re: Does nested query with operator honor the operator or does it always display some default behavior

2014-11-18 Thread Ivan Brusic

I have never seen the array syntax with the match query, so I am not sure
what the behavior should be. Since your search terms are not analyzed in
your example, a terms query with a minimum match of 100% should work. If
not, perhaps creating a single search term of your existing terms?
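
A sketch of what that suggestion might look like as a query body, assuming the 1.x terms query and its `minimum_should_match` option. The second helper is my own addition, not something proposed in the thread: if each nested object holds only one value in NESTED_FIELD.v (an assumption about the mapping), no single nested object can contain both terms, and one nested clause per value may be closer to the intent:

```python
import json

def all_terms_query(field, values):
    """terms query that requires every listed value to match."""
    return {"terms": {field: list(values), "minimum_should_match": "100%"}}

def one_nested_clause_per_value(path, field, values):
    """Alternative: each value must match in some nested object under `path`."""
    return {
        "bool": {
            "must": [
                {"nested": {"path": path, "query": {"term": {field: value}}}}
                for value in values
            ]
        }
    }

symbols = ["AAPL.OQ", "GOOGL.OQ"]
print(json.dumps(all_terms_query("NESTED_FIELD.v", symbols), indent=2))
print(json.dumps(one_nested_clause_per_value("NESTED_FIELD", "NESTED_FIELD.v", symbols), indent=2))
```

Both bodies assume the values are indexed verbatim (not analyzed); with an analyzed field the terms would need to match the analyzed form.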

-- 
Ivan

On Tue, Nov 18, 2014 at 10:23 AM, Ramdev Wudali agasty...@gmail.com wrote:

 Hi :
I have the following query :
 {
   "query": {
     "bool": {
       "must": {
         "nested": {
           "query": {
             "bool": {
               "must": [
                 {
                   "match": {
                     "NESTED_FIELD.v": {
                       "query": ["AAPL.OQ", "GOOGL.OQ"],
                       "operator": "and"
                     }
                   }
                 },
                 {
                   "range": {
                     "NESTED_FIELD.s": {
                       "from": 0.6,
                       "to": null,
                       "include_lower": true,
                       "include_upper": true
                     }
                   }
                 }
               ]
             }
           },
           "path": "NESTED_FIELD"
         }
       }
     }
   },
   "filter": {
     "bool": {
       "must": [
         {
           "range": {
             "DOC_DATE.v": {
               "from": "2014-08-19T20:00:00.000-04:00",
               "to": "2014-10-18T23:59:59.999Z",
               "include_lower": true,
               "include_upper": true
             }
           }
         }
       ]
     }
   }
 }

 The behavior I expect is the following:

 The documents that are returned should contain both values for
 NESTED_FIELD.v (AAPL.OQ and GOOG.OQ), each satisfying the condition that
 its corresponding NESTED_FIELD.s range is also satisfied.

 The behavior I see:
 The documents returned contain either one of the values (that is, a
 document has AAPL.OQ, or GOOG.OQ, or both).

 I want only documents that have both values. So the "operator": "and"
 (and its variant "operator": "AND") does not seem to have any effect.
 Any pointers or suggestions regarding this are much appreciated. I am using
 the Java API to construct my queries (but I do not think it should matter).

 Thanks

 Ramdev



Re: EsStorage - Set field as not_analyzed

2014-11-18 Thread Ivan Brusic

You can set up an index template without creating the index. The template
will only be read when the index is created.
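
A minimal sketch of such a template body, written as a Python dict that mirrors the JSON. It assumes the Elasticsearch 1.x template format, where "template" holds the index name pattern; the pattern, type name, and field name are hypothetical placeholders.

```python
import json

def not_analyzed_template(index_pattern, doc_type, field):
    """Index template body that maps one string field as not_analyzed.

    Assumption: Elasticsearch 1.x template and mapping syntax.
    """
    return {
        "template": index_pattern,
        "mappings": {
            doc_type: {
                "properties": {
                    field: {"type": "string", "index": "not_analyzed"}
                }
            }
        },
    }

# All three arguments are made-up names for illustration.
body = not_analyzed_template("pig-*", "record", "user_id")
print(json.dumps(body, indent=2))
```

The body would be registered through the index template API before the load runs, so that any index later created with a matching name picks up the mapping.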

--
Ivan

On Tue, Nov 18, 2014 at 11:10 AM, Kingsley Elmes kingsleyel...@gmail.com
wrote:

Hi,

Does anyone know whether it is possible to set a field as 'not_analyzed' via the
constructor parameters when using the Pig EsStorage load function?

Or do I have to create the index and set the mappings prior to
loading the data using EsStorage?

Thanks,
Kingsley


Re: 1.4.0 data node can't join existing 1.3.4 cluster

2014-11-13 Thread Ivan Brusic

Rolling upgrades should be supported:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-upgrade.html#rolling-upgrades

How else can you perform a rolling upgrade without having a mixed cluster?

-- 
Ivan

On Thu, Nov 13, 2014 at 1:05 PM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:

 Do not mix 1.3 with 1.4 nodes, it does not work.

 Jörg

 On Thu, Nov 13, 2014 at 5:16 PM, Todd Kamin t...@crowdtangle.com wrote:

 I'm seeing something very similar, also from 1.3.4 to 1.4.0, also using
 the elasticsearch-cloud-aws plugin 2.4.

 [2014-11-13 16:02:26,055][WARN ][discovery.zen.ping.unicast] [White Fang]
 failed to send ping to
 [[#cloud-i-03f79bcb-0][localhost][inet[/10.0.0.76:9300]]]
 org.elasticsearch.transport.RemoteTransportException: [Kraven the
 Hunter][inet[/10.0.0.76:9300]][internal:discovery/zen/unicast]
 Caused by: org.elasticsearch.transport.ActionNotFoundTransportException:
 No handler for action [internal:discovery/zen/unicast]
 at
 org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:210)
 at
 org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:111)


Re: Aggregating on nested fields

2014-11-12 Thread Ivan Brusic

I beg to differ: aggregations work with the root documents returned by
the query, so they do not work under a global context. :) I guess under my
proposed vision the issue would then be how to have aggregations on
documents returned with a nested filter, but still maintain all the nested
documents. A pseudo-global nested context. That sounds more painful than
returning all the nested documents.
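
To make the scoping concrete, here is a plain-Python model of the behavior discussed in this thread. It is not Elasticsearch code, only a sketch that reuses the three documents and the nstd.ID = 1 filter from the original question.

```python
from collections import Counter

# The three documents from the original question.
docs = [
    {"name": "foo", "nstd": [{"ID": 1}]},
    {"name": "bar", "nstd": [{"ID": 2}]},
    {"name": "baz", "nstd": [{"ID": 1}, {"ID": 2}]},
]

def matches(nested_doc):
    """The nested filter from the question: term nstd.ID = 1."""
    return nested_doc["ID"] == 1

# The query returns root documents that have at least one matching nested doc.
roots = [d for d in docs if any(matches(n) for n in d["nstd"])]

# A nested aggregation sees every nested doc of those root documents.
unfiltered = Counter(n["ID"] for d in roots for n in d["nstd"])

# Repeating the filter under the nested aggregation keeps only matching nested docs.
filtered = Counter(n["ID"] for d in roots for n in d["nstd"] if matches(n))

print(dict(unfiltered))  # {1: 2, 2: 1}
print(dict(filtered))    # {1: 2}
```

The first result reproduces the buckets reported in the question; the second is what repeating the filter under the nested aggregation yields.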

Cheers,

Ivan

On Tue, Nov 11, 2014 at 5:30 PM, Adrien Grand
adrien.gr...@elasticsearch.com wrote:

Hi Ivan,

You indeed need to repeat the filter under a nested aggregation to make it
work. If we ever allow queries to return nested documents, I agree that
filters should not be repeated under aggs, but since now queries only
return the root documents, I think it is actually consistent to return all
nested docs under a nested aggregation, and not only those that matched a
(potential) nested query. I also like the fact that it allows aggregations
to not know about the query.

On Tue, Nov 11, 2014 at 5:27 PM, Ivan Brusic i...@brusic.com wrote:

I suddenly remembered when using facets that I had to apply the same
query filter as a facet filter with the join option disabled. Turns out it
is somewhat identical with aggregations. My problem was that the scope of
my nested aggregation was not under the scope of the filter aggregation. I
hope #3022 and related issues can bring about less ambiguous aggregations.
Nested aggregations on pre-filtered nested documents should work as is. If
not, the global scope aggregation should be used.

Ivan

On Mon, Nov 10, 2014 at 3:43 PM, Ivan Brusic i...@brusic.com wrote:

Reproducible gist: https://gist.github.com/brusic/81e1552ffd49a1f6a7aa

Surely I cannot be the only one to have encountered this issue.

--
Ivan

On Mon, Nov 10, 2014 at 12:53 PM, Ivan Brusic i...@brusic.com wrote:

Is it possible to aggregate only on the nested documents that are
returned by a (filtered) query? From what I can tell, a nested
aggregation will operate on all nested documents of the parent
documents whose nested documents satisfy a nested query/filter. Did that
make sense? :) Is this the same limitation as issue #3022? I know that
number by heart by now.

For example, I have 3 simple documents, where the nstd object is
defined as nested:

{
  "name": "foo",
  "nstd": [
    { "ID": 1 }
  ]
}

{
  "name": "bar",
  "nstd": [
    { "ID": 2 }
  ]
}

{
  "name": "baz",
  "nstd": [
    { "ID": 1 },
    { "ID": 2 }
  ]
}

I then execute a simple nested query:

"query": {
  "filtered": {
    "query": {
      "match_all": {}
    },
    "filter": {
      "nested": {
        "path": "nstd",
        "filter": {
          "term": {
            "nstd.ID": 1
          }
        }
      }
    }
  }
}

If I aggregate on the nstd.ID field, I will always get back results for
nested documents that were excluded by the filter:

"buckets": [
  {
    "key": 1,
    "doc_count": 2
  },
  {
    "key": 2,
    "doc_count": 1
  }
]

Since the ID:2 field does not match the filter, it should not be
returned with the aggregation. I have tried using a filter aggregation with
the same filter used in the filtered query, but I receive the same results.

Cheers,

Ivan


--
Adrien Grand


Re: Exact name match search - can't get it to work

2014-11-12 Thread Ivan Brusic

The terms are copied to the full name field and are not analyzed, as specified.
However, two terms are being copied, not one. The term query expects a
single token of "Jeremy Smith", while you have two separate non-analyzed
tokens.
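
A small plain-Python model of why the lookup fails. It is only a sketch, not Elasticsearch code, and it assumes the gist copies a first-name and a last-name field into the full-name field.

```python
def index_not_analyzed(values):
    """Model of a not_analyzed field: each value is indexed verbatim as one term."""
    return set(values)

# Assumption: "Jeremy" and "Smith" arrive as two separate copy_to values.
full_name_terms = index_not_analyzed(["Jeremy", "Smith"])

def term_query(terms, value):
    """A term query matches only when the exact value is one of the indexed terms."""
    return value in terms

print(term_query(full_name_terms, "Jeremy Smith"))  # False
print(term_query(full_name_terms, "Jeremy"))        # True
```

Under this model the full-name field would have to receive the single value "Jeremy Smith" at index time for the exact match to succeed.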

Cheers,

Ivan
On Nov 12, 2014 10:29 AM, Robert Alkire dracomor...@gmail.com wrote:

https://gist.github.com/anonymous/b59ea5a6bbf308f8e562

This is the definition of the problem.
It seems that index : not_analyzed is broken when using copy_to fields


Re: Custom cluster action

2014-11-12 Thread Ivan Brusic

There is also an ActionModule:

public void onModule(ActionModule module) {
    // Register the action together with its transport handler.
    module.registerAction(MyAction.INSTANCE, TransportMyAction.class);
}

It is always easier to follow existing plugins.

Cheers,

Ivan

On Wed, Nov 12, 2014 at 3:50 PM, Pawel pro...@gmail.com wrote:

 Hi,
 I'm thinking about building a custom ClusterAction. I see that I can build
 custom classes for Request, NodeResponse and NodesResponse, but it is not
 clear to me how I can register my custom action.

 In the case of a REST action it was quite easy, because in the plugin I simply use

 public void onModule(RestModule module) {
     module.addRestAction(RestCustomAction.class);
 }

 but I cannot find any examples of how to do this for a custom
 ClusterAction.

 Does anybody have an example of how I can achieve this? Thanks for your help.

 --
 Paweł Róg


Netflix releases Elasticsearch automation tool

2014-11-12 Thread Ivan Brusic

Interesting in that they use Cassandra for discovery.

http://techblog.netflix.com/2014/11/introducing-raigad-elasticsearch-sidecar.html


Re: Aggregating on nested fields

2014-11-11 Thread Ivan Brusic

I suddenly remembered when using facets that I had to apply the same query
filter as a facet filter with the join option disabled. Turns out it is
somewhat identical with aggregations. My problem was that the scope of my
nested aggregation was not under the scope of the filter aggregation. I
hope #3022 and related issues can bring about less ambiguous aggregations.
Nested aggregations on pre-filtered nested documents should work as is. If
not, the global scope aggregation should be used.
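
For reference, a sketch of how the scoping could look in the request body, written as a Python dict that mirrors the JSON. It assumes the Elasticsearch 1.x nested, filter, and terms aggregations; the aggregation names are hypothetical, while the path and field come from the example further down.

```python
import json

# Same term filter as in the filtered query of the original post.
nested_filter = {"term": {"nstd.ID": 1}}

aggs = {
    "nstd_docs": {                        # hypothetical aggregation name
        "nested": {"path": "nstd"},       # step into the nested documents
        "aggs": {
            "only_matching": {            # hypothetical aggregation name
                "filter": nested_filter,  # repeat the filter inside the nested scope
                "aggs": {
                    "ids": {"terms": {"field": "nstd.ID"}}
                },
            }
        },
    }
}

print(json.dumps(aggs, indent=2))
```

The point of the layout is that the filter aggregation sits under the nested aggregation, so the terms buckets are computed only from nested documents that pass the filter.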

-- 

Ivan


On Mon, Nov 10, 2014 at 3:43 PM, Ivan Brusic i...@brusic.com wrote:

 Reproducible gist: https://gist.github.com/brusic/81e1552ffd49a1f6a7aa

 Surely I cannot be the only one to have encountered this issue.

 --
 Ivan

 On Mon, Nov 10, 2014 at 12:53 PM, Ivan Brusic i...@brusic.com wrote:

 Is it possible to aggregate only on the nested documents that are
 returned by a (filtered) query? From what I can tell, a nested
 aggregation will operate on all nested documents of the parent
 documents whose nested documents satisfy a nested query/filter. Did that
 make sense? :) Is this the same limitation as issue #3022? I know that
 number by heart by now.

 For example, I have 3 simple documents, where the nstd object is defined
 as nested:

 {
   "name": "foo",
   "nstd": [
     { "ID": 1 }
   ]
 }

 {
   "name": "bar",
   "nstd": [
     { "ID": 2 }
   ]
 }

 {
   "name": "baz",
   "nstd": [
     { "ID": 1 },
     { "ID": 2 }
   ]
 }

 I then execute a simple nested query:

 "query": {
   "filtered": {
     "query": {
       "match_all": {}
     },
     "filter": {
       "nested": {
         "path": "nstd",
         "filter": {
           "term": {
             "nstd.ID": 1
           }
         }
       }
     }
   }
 }

 If I aggregate on the nstd.ID field, I will always get back results for
 nested documents that were excluded by the filter:

 "buckets": [
   {
     "key": 1,
     "doc_count": 2
   },
   {
     "key": 2,
     "doc_count": 1
   }
 ]

 Since the ID:2 field does not match the filter, it should not be returned
 with the aggregation. I have tried using a filter aggregation with the same
 filter used in the filtered query, but I receive the same results.

 Cheers,

 Ivan





Re: Integrated authentication

2014-11-11 Thread Ivan Brusic

Quite the opposite: the Elasticsearch team and others have said that
authentication belongs outside of the application. Or at least, security
was not a high priority.

It seems like they are working on security and a release should be
forthcoming:
http://www.elasticsearch.org/blog/shield-know-security-coming-soon/

The question remains whether this will be a paid product like Marvel or not.

--
Ivan

On Tue, Nov 11, 2014 at 11:40 AM, Simon Thorley si...@thenom.co.uk wrote:

Hi all,

I was under the impression that, as of Elasticsearch 1.0,
authenticated access was going to be integrated without having to use a
third-party proxy (e.g. nginx)?

Elasticsearch is normally quite well documented, but I can't seem to find
anything regarding setting this up.

Probably just me not looking hard enough, or I could be totally wrong.

Any help appreciated.

Cheers,
Simon


Re: Filter cache - based on full set or result of previous filters?

2014-11-11 Thread Ivan Brusic

The status filter cache will indeed contain all entries. And technically,
the cache is per segment, and not across all documents, but this should be
transparent.

Caching is enabled by default for the term filters, but disabled for the
bool filter. You can enable it if you think users will be reusing the
filter.
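
A sketch of what enabling the cache on the bool filter could look like, as a Python dict that mirrors the JSON request. It assumes the Elasticsearch 1.x `_cache` filter option; the field names and values are the ones from the question.

```python
import json

bool_filter = {
    "bool": {
        "must": [
            {"term": {"status": "paid"}},    # term filters are cached by default
            {"term": {"account": 123456}},
        ],
        # Assumption: Elasticsearch 1.x option that caches the combined bool result.
        "_cache": True,
    }
}

request = {"query": {"filtered": {"filter": bool_filter}}}
print(json.dumps(request, indent=2))
```

Caching the combined result is probably only worthwhile when the exact same combination of clauses is reused; the individual term filters are cached either way.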

--
Ivan

On Tue, Nov 11, 2014 at 3:23 AM, Lasse Schou lassesc...@gmail.com wrote:

Hi,

I have a search request that uses a couple of filters. I'm using
bool+must, and I'm trying to optimize the request as much as possible.

- Some filters are used by all users of my platform, but aren't very
selective.
- Some filters are very specific to individual users, and are highly
selective.

I've read that I should use the most selective filters first, to ease the
work performed by the subsequent filters.

However, one thing that's not 100% clear is how the filter cache bitmaps
work. Do they store the result of a filter as performed across the entire
dataset, or do they store the filtered result of the previous filter's
output?

Example. Querying the paid invoices of an account:

{ "query":
  { "filtered":
    { "filter":
      { "bool":
        { "must": [
            { "term": { "status": "paid" } },   (all users use this, but it's not very selective)
            { "term": { "account": 123456 } }
        ]}
      }
    }
  }
}

Following the advice of using the most highly selective filter first, I
should place the account filter first. On the other hand I want to be
sure that all users will re-use the cached output of the status filter.

Question: will the status filter cache contain *all* paid invoices of
all accounts, no matter in which order I use the filters?

The above code is just an example - I'm trying to optimize the code for a
dataset for 1B+ documents, so please take this into consideration.

Thanks,
Lasse


Re: accidently ran another instance of elasticsearch on a few nodes

2014-11-10 Thread Ivan Brusic

To avoid this situation in the future, besides using a service to start
elasticsearch, you can enforce the max nodes setting:

node.max_local_storage_nodes: 1

--
Ivan

On Sun, Nov 9, 2014 at 5:24 PM, Mark Walkom markwal...@gmail.com wrote:

Yellow means unassigned replicas, try removing them and then adding them
back.

Once your cluster is green you can stop one of the nodes with the extra
data and then delete the extra directory, just make sure you let the other
nodes rebalance and your cluster is green again before deleting, otherwise
you may risk losing data.

On 8 November 2014 19:12, Johan Öhr johan@gmail.com wrote:

Hi, while trying to set up another process as master, I believe I ran
multiple instances of elasticsearch on three nodes for some time.

On these nodes, it looks like this:
/var/lib/elasticsearch/elasticsearch/indices/0
/var/lib/elasticsearch/elasticsearch/indices/1

On my other nodes, which are fine, it looks like:
/var/lib/elasticsearch/elasticsearch/indices/0

So, there is a lot of data in the 1 directory on three nodes, and these
shards will not be assigned; my cluster stays yellow.

This mistake happened 1 week ago. Since then I have restarted ES a couple
of times, but it was only just now that I got the problem.

How can I fix this?

At the moment I'm running 5 nodes, 3 of which run another instance of
elasticsearch as master only.


Re: Looking for a sexy solution for Aggregations

2014-11-10 Thread Ivan Brusic

The only solution that I can think of is to execute your query with a post
filter, not a filtered query. In this way, your aggregations by default
will not be filtered. You can then have two histograms, one with the post
filter used as an aggregation filter, and the other one left alone.
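
A sketch of that idea, with the request body written as a Python dict that mirrors the JSON, plus a small helper that merges the two bucket lists on the client. It assumes the Elasticsearch 1.x `post_filter` and filter aggregation; the aggregation names, the helper names, and the example filter are hypothetical.

```python
import json

field = "Sociodemographic_economic_characteristics"
terms_agg = {"terms": {"field": field, "size": 0, "min_doc_count": 0,
                       "order": {"_term": "asc"}}}

def build_request(user_filter):
    """One search request carrying both the unfiltered and the filtered counts."""
    return {
        "query": {"match_all": {}},
        "post_filter": user_filter,      # filters the hits only, not the aggregations
        "aggs": {
            "all": terms_agg,            # totals, untouched by the post filter
            "filtered": {                # same filter applied as a filter aggregation
                "filter": user_filter,
                "aggs": {"values": terms_agg},
            },
        },
    }

def combine(all_buckets, filtered_buckets):
    """Merge two bucket lists into 'key: filtered/total' lines."""
    filtered = {b["key"]: b["doc_count"] for b in filtered_buckets}
    return ["%s: %d/%d" % (b["key"], filtered.get(b["key"], 0), b["doc_count"])
            for b in all_buckets]

# The filter is a made-up example; the counts are taken from the question.
request = build_request({"term": {field: "Labour_retirement"}})
lines = combine(
    [{"key": "Age", "doc_count": 93}, {"key": "Labour_retirement", "doc_count": 150}],
    [{"key": "Age", "doc_count": 0}, {"key": "Labour_retirement", "doc_count": 150}],
)
print(json.dumps(request, indent=2))
print("\n".join(lines))
```

With the two sample buckets this prints "Age: 0/93" and "Labour_retirement: 150/150", which is the display format asked for in the question.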

-- 
Ivan

On Fri, Nov 7, 2014 at 11:04 AM, kazoompa rha...@p3g.org wrote:

 Hi,

 Consider the aggregation below:

 "Sociodemographic_economic_characteristics": {
   "terms": {
     "field": "Sociodemographic_economic_characteristics",
     "size": 0,
     "min_doc_count": 0,
     "order": {
       "_term": "asc"
     }
   }
 }

 This is the result without any filters:

 "Sociodemographic_economic_characteristics": {
   "buckets": [
     { "key": "Age", "doc_count": 93 },
     { "key": "Education", "doc_count": 42 },
     { "key": "Ethnic_race_religion", "doc_count": 17 },
     { "key": "Family_hh_struct", "doc_count": 55 },
     { "key": "Income", "doc_count": 10 },
     { "key": "Labour_retirement", "doc_count": 150 },
     { "key": "Marital_status", "doc_count": 20 },
     { "key": "Residence", "doc_count": 20 },
     { "key": "Sex", "doc_count": 7 }
   ]
 }


 This is the result with the filter:


 "Sociodemographic_economic_characteristics": {
   "buckets": [
     { "key": "Age", "doc_count": 0 },
     { "key": "Education", "doc_count": 0 },
     { "key": "Ethnic_race_religion", "doc_count": 0 },
     { "key": "Family_hh_struct", "doc_count": 0 },
     { "key": "Income", "doc_count": 0 },
     { "key": "Labour_retirement", "doc_count": 150 },
     { "key": "Marital_status", "doc_count": 0 },
     { "key": "Residence", "doc_count": 0 },
     { "key": "Sex", "doc_count": 0 }
   ]
 }


 I would like to find a way to have the two combined in one search request
 such that the client can show the info in this manner:

 Age: 0/93
 Education: 0/42
 Ethnic_race_religion: 0/17
 Family_hh_struct: 0/55
 Income: 0/10
 Labour_retirement: 150/150
 ...


 As alternatives, I considered doing a Multiple Search or two independent
 search queries, but is there any way to do this in one go using the
 Elasticsearch goodies (nested aggs, etc.)?

 Thanks,
 Ramin





Re: min_score doesn't seem to be working with _count api

2014-11-10 Thread Ivan Brusic

Just a guess, but I would assume that the count API does not score
documents, which is why it is faster, and which would make a setting such as
min_score meaningless.

-- 
Ivan

On Mon, Nov 10, 2014 at 10:55 AM, Roly Vicaria roly...@gmail.com wrote:

 Also, I'm trying this on v1.3.2


 On Monday, November 10, 2014 10:54:33 AM UTC-5, Roly Vicaria wrote:

 Hello,

 I'm trying to pass a min_score parameter using the _count api, but it
 doesn't seem to work. When I add it via query string, it just gets ignored
 and returns the full count. When I add it to the query using min_score :
 1, then I get the following exception response:

 {count:0,_shards:{total:5,successful:0,failed:5,
 failures:[{index:recruitment,shard:1,reason:
 BroadcastShardOperationFailedException[[recruitment][1] ]; nested:
 QueryParsingException[[recruitment] request does not support
 [min_score]]; },{index:recruitment,shard:0,reason:
 BroadcastShardOperationFailedException[[recruitment][0] ]; nested:
 QueryParsingException[[recruitment] request does not support
 [min_score]]; },{index:recruitment,shard:3,reason:
 BroadcastShardOperationFailedException[[recruitment][3] ]; nested:
 QueryParsingException[[recruitment] request does not support
 [min_score]]; },{index:recruitment,shard:2,reason:
 BroadcastShardOperationFailedException[[recruitment][2] ]; nested:
 QueryParsingException[[recruitment] request does not support
 [min_score]]; },{index:recruitment,shard:4,reason:
 BroadcastShardOperationFailedException[[recruitment][4] ]; nested:
 QueryParsingException[[recruitment] request does not support
 [min_score]]; }]}}

 Has anyone else run into this?

 Thanks,
 Roly


Re: does there exists an exists query

2014-11-10 Thread Ivan Brusic

Off the top of my head, the easiest option would be to use a constant score
query. Wrap the original query and provide a boost to documents that
satisfy your exists filter.
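
One possible reading of this suggestion, sketched as a Python dict that mirrors the JSON request. It assumes Elasticsearch 1.x syntax for bool, constant_score, and the exists filter; the helper name, the sample query, and the field names are hypothetical.

```python
import json

def boost_if_exists(original_query, field, boost=2.0):
    """Keep all hits of original_query, but score documents higher when field exists."""
    return {
        "bool": {
            "must": [original_query],
            "should": [
                {
                    "constant_score": {
                        "filter": {"exists": {"field": field}},
                        "boost": boost,
                    }
                }
            ],
        }
    }

# The match query and both field names are made up for illustration.
query = boost_if_exists({"match": {"title": "elasticsearch"}}, "subtitle")
print(json.dumps({"query": query}, indent=2))
```

Because the constant score clause is only a should clause next to a must clause, documents without the field still match; they just miss the extra score.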

Cheers,

Ivan

On Mon, Nov 10, 2014 at 12:54 PM, Volker s...@klest.de wrote:

I would like to know whether there is an exists query in ES.

I know that there is an exists filter, but I would like to have an exists
query, so that documents where a field exists are rated higher than the
ones where the field does not exist. But if no document matching the
query has the field, it should still return the other documents.
This would not work with the exists filter, as far as I know.

I know, that I could index an additional field, with the value (e.g.)
true, when the field exists. But I would rather not have this additional
data in the index.

So, what is the best solution for this use case?

thanks in advance!


Re: Aggregating on nested fields

2014-11-10 Thread Ivan Brusic

Reproducible gist: https://gist.github.com/brusic/81e1552ffd49a1f6a7aa

Surely I cannot be the only one to have encountered this issue.

-- 
Ivan

On Mon, Nov 10, 2014 at 12:53 PM, Ivan Brusic i...@brusic.com wrote:

 Is it possible to aggregate only on the nested documents that are returned
 by a (filtered) query? From what I can tell, when using a nested aggregation,
 it will function on all nested documents of the parent documents whose
 nested documents satisfy a nested query/filter. Did that make sense? :) Is
 this the same limitation as issue #3022? I know that number by heart by now.

 For example, I have 3 simple documents, where the nstd object is defined
 as nested:

 {
   "name": "foo",
   "nstd": [
     { "ID": 1 }
   ]
 }

 {
   "name": "bar",
   "nstd": [
     { "ID": 2 }
   ]
 }

 {
   "name": "baz",
   "nstd": [
     { "ID": 1 },
     { "ID": 2 }
   ]
 }

 I then execute a simple nested query:

 "query": {
   "filtered": {
     "query": {
       "match_all": {}
     },
     "filter": {
       "nested": {
         "path": "nstd",
         "filter": {
           "term": {
             "nstd.ID": 1
           }
         }
       }
     }
   }
 }

 If I aggregate on the nstd.ID field, I will always get back results for
 nested documents that were excluded by the filter:

 "buckets": [
   {
     "key": 1,
     "doc_count": 2
   },
   {
     "key": 2,
     "doc_count": 1
   }
 ]

 Since the ID:2 field does not match the filter, it should not be returned
 with the aggregation. I have tried using a filter aggregation with the same
 filter used in the filtered query, but I receive the same results.

 Cheers,

 Ivan
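
The bucket counts above can be reproduced with a small, purely illustrative Python simulation (no Elasticsearch involved). It assumes that the nested filter selects parent documents and that the aggregation then visits every nested document of those parents. The `aggs` dict at the end is an untested assumption about one possible workaround: a filter aggregation placed inside the nested aggregation, so that the condition is applied to the nested documents themselves. The aggregation names `only_id_1` and `ids` are hypothetical.

```python
from collections import Counter

# The three parent documents from the example; `nstd` holds the nested docs.
docs = [
    {"name": "foo", "nstd": [{"ID": 1}]},
    {"name": "bar", "nstd": [{"ID": 2}]},
    {"name": "baz", "nstd": [{"ID": 1}, {"ID": 2}]},
]


def parents_matching(docs, nested_id):
    """Parents with at least one nested doc whose ID equals nested_id."""
    return [d for d in docs if any(n["ID"] == nested_id for n in d["nstd"])]


def count_ids(parents, keep=lambda nested: True):
    """Count nested IDs across the given parents, optionally filtering nested docs."""
    return Counter(n["ID"] for p in parents for n in p["nstd"] if keep(n))


matched = parents_matching(docs, 1)  # foo and baz

# Aggregating over every nested doc of the matched parents.
all_nested = dict(count_ids(matched))
print(all_nested)  # {1: 2, 2: 1}

# Applying the same condition to the nested docs themselves.
filtered_nested = dict(count_ids(matched, keep=lambda n: n["ID"] == 1))
print(filtered_nested)  # {1: 2}

# Assumed (untested) request fragment: filter inside the nested aggregation.
aggs = {
    "nstd": {
        "nested": {"path": "nstd"},
        "aggs": {
            "only_id_1": {
                "filter": {"term": {"nstd.ID": 1}},
                "aggs": {"ids": {"terms": {"field": "nstd.ID"}}},
            }
        },
    }
}
```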



Re: SpanNotQuery issues

2014-11-07 Thread Ivan Brusic

Yes, that post explained it a lot better than I wanted to. :) But basically
yes, the exclude portion only applies as part of an existing span, but a single
span term is not really a span.

Ultimately, span queries are not very flexible since they do not analyze
terms, which is why I suspect there are rarely any questions about them
(and perhaps why my PR is in limbo). Phrase matches might work better, but
they do not support in-order slop.

--
Ivan
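
To make the overlap rule concrete, here is a toy Python model of span_not. It is not Lucene code: spans are plain (start, end) token positions, the helper names are made up, and the lowercased, whitespace-split token list is an assumption for illustration. An include span is dropped only when an exclude span overlaps it, so a one-token include is unaffected by an exclude term that sits elsewhere in the field.

```python
def term_spans(tokens, term):
    """Single-position spans (start, end) for each occurrence of term."""
    return [(i, i + 1) for i, t in enumerate(tokens) if t == term]


def near_spans(tokens, first, second, slop):
    """In-order spans from `first` to `second` with at most `slop` tokens between."""
    spans = []
    for start, _ in term_spans(tokens, first):
        for other_start, other_end in term_spans(tokens, second):
            if start < other_start and other_start - start - 1 <= slop:
                spans.append((start, other_end))
    return spans


def span_not(include, exclude):
    """Keep the include spans that do not overlap any exclude span."""
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]

    return [s for s in include if not any(overlaps(s, x) for x in exclude)]


# Hypothetical, naively tokenized field value.
tokens = "handheld apple ipad 2 wi fi tablet ios 5 64 gb 9.7 black buy.com".split()

# A lone term span never overlaps an exclude term at another position.
single = span_not(term_spans(tokens, "ipad"), term_spans(tokens, "black"))

# A wider include span is removed when the excluded term falls inside it.
wide = near_spans(tokens, "ipad", "tablet", slop=3)
inside = span_not(wide, term_spans(tokens, "wi"))
outside = span_not(wide, term_spans(tokens, "black"))

print(single, inside, outside)
```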

On Thu, Nov 6, 2014 at 2:56 PM, Jade Tremblay jadetremblay@gmail.com
wrote:

@Ivan

I've picked up your pull request and integrated it into the Elasticsearch
source at tag v1.3.5.
After a rebuild, it seems to work perfectly (I am still trying to find the
maximum values for pre and post, no luck so far).

I've been able to figure out how span_not works thanks to this post.
I'm adding it here in case someone else wants to understand why the gist
returns 2 results.

http://stackoverflow.com/questions/24260103/spannotquery-giving-unexpected-results-exclude-is-ignored

Thanks for the gist and the pull request!

Cheers,
Jade


Re: Distributed Frequency Search

2014-11-06 Thread Ivan Brusic

We did some performance testing and found that the performance hit from
using DFS was minor.

--
Ivan

On Wed, Nov 5, 2014 at 8:55 AM, Sofiane Cherchalli sofian...@gmail.com
wrote:

Answering myself:

According to the ES blog
http://www.elasticsearch.org/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch/
there is a performance hit. It would be nice to have a feature that
automatically triggers DFS based on some kind of threshold...

On Wednesday, November 5, 2014 2:44:14 PM UTC+1, Sofiane Cherchalli wrote:

According to
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/relevance-is-broken.html
relevance is broken until we have enough data distributed uniformly
across shards.

My question is: if I initially use the ?search_type=dfs_query_then_fetch
parameter because I have little data, will it affect performance once the
production environment has enough data sharded uniformly?

Thanks.


Re: SpanNotQuery issues

2014-11-06 Thread Ivan Brusic

Pretty old indeed. As explained briefly, I was migrating a Lucene system to
Elasticsearch and did not understand why the span not queries were not
working, only to discover we had a custom parser, to support syntaxes such
as the one you are expecting.

Span nots are tricky in Lucene, but basically you are looking for when
dog is not a near quick near a dog. Sounds confusing because it is.

I noticed you +1'd the pull request I submitted, which will probably help your
use case. The code should work as is without needing to be merged into the
current codebase, but I guess merging would help. I am stuck on a laptop with
Java 6, so I can no longer build Elasticsearch. Will try to find the time.

Cheers,

Ivan

On Thu, Nov 6, 2014 at 10:12 AM, Jade Tremblay jadetremblay@gmail.com
wrote:

Hello Ivan,

I know this post is pretty old.
I am definitely puzzled by the gist that you provided.
Why are there 2 matches?
"exclude" : {
    "span_term" : {
        "field1" : "dog"
    }
}
I thought we should exclude matches with dog...
Could you please point me to proper information to understand what is
happening?

Thx,
Jade

Le mercredi 22 août 2012 02:01:09 UTC-4, Ivan Brusic a écrit :

Slight error on my part. Without even realizing it, I was using a
custom query parser in Lucene that handled SpanNotQueries differently.
The queries work as expected in ElasticSearch, true to the Lucene
standard.

--
Ivan

On Tue, Aug 21, 2012 at 1:58 PM, Ivan Brusic iv...@brusic.com wrote:
Reproducible issue: https://gist.github.com/d3f9f82ec6c95c11585e

On Tue, Aug 21, 2012 at 11:59 AM, Ivan Brusic iv...@brusic.com
wrote:
Is anyone using SpanNotQuerys? Judging by a recent issue that was
never uncovered until now, I am assuming not:
https://github.com/elasticsearch/elasticsearch/issues/2192

The exclude portion of my SpanNotQuerys are having no effect on the
query.

Given a document
{ "title": "Handheld Apple iPad 2 Wi Fi tablet iOS 5 64 GB 9.7 black Buy.com" }

These two queries return the same results:

SpanNotQuery
{
    "query" : {
        "span_not" : {
            "include" : {
                "span_term" : {
                    "title" : "ipad"
                }
            },
            "exclude" : {
                "span_term" : {
                    "title" : "black"
                }
            }
        }
    }
}

SpanTermQuery
{
    "query" : {
        "span_term" : {
            "TitleString_en" : {
                "value" : "ipad",
                "boost" : 1.0
            }
        }
    }
}

Explain between the two queries is identical:
9.180814 = fieldWeight(title:spanNot(ipad, black) in 531555), product
of:
0.70710677 = tf(phraseFreq=0.5)
7.4192176 = idf(title: ipad=9370)
1.75 = fieldNorm(field=title, doc=531555)

SpanNotQueries in Lucene are working perfectly: spanNot(title:ipad,
title:black). I haven't traced through the code in ElasticSearch, but
the code seems to be creating the correct Lucene class. Anyone
successfully using SpanNotQuerys?

Cheers,

Ivan


Re: ElasticSearch enable Snowball Analyzer and Synonym on Fields

2014-11-06 Thread Ivan Brusic

You would need to create a custom analyzer by basically repeating the
configuration of the snowball analyzer, but adding in the synonym filter.
You can't modify a stock analyzer, unless this has changed (if so, someone
please correct me).

--
Ivan
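
A rough sketch of what such a custom analyzer could look like, expressed as a Python dict that serializes to index settings. It assumes the 1.x-era analysis components (a `standard` tokenizer followed by `standard`, `lowercase`, `stop` and `snowball` token filters, which is how the stock snowball analyzer is documented). The names `snowball_with_synonyms` and `english_snowball` are made up, the synonym filter definition is copied from the question below, and placing the synonym filter before stemming is a design choice, not a requirement.

```python
import json

settings = {
    "settings": {
        "index": {
            "analysis": {
                "filter": {
                    # Synonym filter as defined in the original question.
                    "synonym": {
                        "type": "synonym",
                        "format": "wordnet",
                        "synonyms_path": "/elasticsearch/wn_s.pl",
                    },
                    # Explicit stemmer, mirroring the snowball analyzer's language.
                    "english_snowball": {"type": "snowball", "language": "English"},
                },
                "analyzer": {
                    # Hypothetical name; repeats the snowball chain plus synonyms.
                    "snowball_with_synonyms": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": [
                            "standard",
                            "lowercase",
                            "stop",
                            "synonym",
                            "english_snowball",
                        ],
                    }
                },
            }
        }
    }
}

print(json.dumps(settings, indent=2))
```

Fields that need synonyms would then reference `snowball_with_synonyms` in their mapping, while the remaining fields keep the default snowball analyzer.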

On Wed, Nov 5, 2014 at 6:43 PM, Iqbal Ahmed iq.q...@gmail.com wrote:

Hi guys,

Originally I posted this in SO but found this place which seems more
suitable to ask :)

I have an elasticsearch index where my default analyzer is the snowball
analyzer so I can get the stemming and now I need the ability to have
synonyms on some of the fields as well as the benefit of the snowball
analyzer to do stemming.

Is this possible and if so how?

As a test I tried to set a synonym filter on the snowball analyzer,
hoping that it would enable synonyms on all fields so I could test it, but
that didn't really work...

I created another analyzer for synonyms on my index with the WordNet
synonym file.

If I am not clear please let me know and I'll try to update. Here are my
current index settings.

{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "synonym": {
                        "filter": [
                            "synonym"
                        ],
                        "tokenizer": "whitespace"
                    },
                    "default": {
                        "language": "English",
                        "type": "snowball"
                    }
                },
                "filter": {
                    "synonym": {
                        "type": "synonym",
                        "synonyms_path": "/elasticsearch/wn_s.pl",
                        "format": "wordnet"
                    }
                }
            }
        }
    }
}


Re: how to search non indexed field in elasticsearch

2014-11-06 Thread Ivan Brusic

You cannot search/filter on a non-indexed field.

-- 
Ivan
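
For illustration, this is roughly how the distinction shows up in a mapping, sketched as a Python dict. The type name `doc` and both field names are hypothetical, and `"index": "no"` is assumed to be the 1.x-era option for a string field that is kept in `_source` but not searchable. The small helper only inspects the dict; it does not talk to a cluster.

```python
# Hypothetical mapping: one searchable field, one stored-only field.
mapping = {
    "mappings": {
        "doc": {
            "properties": {
                "title": {"type": "string"},
                "internal_note": {"type": "string", "index": "no"},
            }
        }
    }
}


def searchable_fields(mapping, doc_type):
    """Names of fields that are not explicitly excluded from the index."""
    props = mapping["mappings"][doc_type]["properties"]
    return sorted(name for name, spec in props.items() if spec.get("index") != "no")


print(searchable_fields(mapping, "doc"))  # ['title']
```

Making a field searchable after the fact would mean changing the mapping and reindexing the documents.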

On Wed, Nov 5, 2014 at 11:45 PM, ramakrishna panguluri 
panguluri.ramakris...@gmail.com wrote:

 I have 10 fields inserted into elasticsearch out of which 5 fields are
 indexed.
 Is it possible to search on non indexed field?

 Thanks in advance.


 Regards
 Rama Krishna P


Re: Announce Mailing List

2014-11-06 Thread Ivan Brusic

An announce list would be awesome, but at the very least announcements could
be posted to this list with an [ANN] or [ANNOUNCEMENT] prefix, as David has
been doing.

Elasticsearch 1.4.0 and 1.3.5 were released, but there is no announcement
on the list. Elasticsearch also announced a product called Shield, which
should generate lots of discussion, but once again, nothing. :)

--
Ivan

On Sun, Sep 21, 2014 at 6:20 PM, Mark Walkom ma...@campaignmonitor.com
wrote:

An announce list would be awesome :)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 22 September 2014 04:06, Igal @ getRailo.org i...@getrailo.org wrote:

I understand, but thanks to your project's success the mailing list
generates a lot of traffic, and if I choose only the digest then
unless I read through the whole email I have no idea what happened.

Every other project of this magnitude has a separate announce mailing
list where new releases and updates are announced.

IMO you should adopt that practice as well.

best,

Igal

On 9/21/2014 11:01 AM, David Pilato wrote:

I use this mailing list for announcements about plugins.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 21 sept. 2014 à 19:56, Igal i...@getrailo.org a écrit :

Is there an announce mailing list?

ATM I am trying to cut down on the amount of emails that I get, but I
am very interested in learning about new version releases etc.

--
Igal Sapir
Railo Core Developer http://getRailo.org/


Re: Bool Queries and MUST/SHOULD combinations

2014-11-04 Thread Ivan Brusic

Should clauses used alongside must clauses only matter in queries (not
filters), since they contribute to the scoring for a document.
The should clauses will improve the score for the documents that match.

-- 
Ivan

On Mon, Nov 3, 2014 at 5:51 PM, kazoompa rha...@p3g.org wrote:

 Thanks Ivan,

 We would like to create complex queries as explained on this page:
 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/combining-filters.html#bool-filter.
 I have to admit I don't see why anybody would want to put MUSTs and SHOULDs
 at the same level. After further analysis, it seems that if I want to do
 something like:

 (For this example, consider A, B, and C as terms filters, with 'in' listing
 their possible values:)

 A in [a1, a2, ...] OR B in [b1, b2, ...] AND C in [c1, c2, ...]  // order
 is important


 I have to implement my bool filter as:

 {
   "bool": {
     "must": [
       {
         "bool": {
           "should": [
             {
               "terms": {
                 "A": ["a1", "a2"]
               }
             },
             {
               "terms": {
                 "B": ["b1", "b2"]
               }
             }
           ]
         }
       },
       {
         "terms": {
           "C": ["c1", "c2"]
         }
       }
     ]
   }
 }


 It's sort of a Polish notation for queries ;)

 Cheers.
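
 The nesting described above, (A OR B) AND C, can be assembled with a few small helpers. This is only a sketch in Python; the helper names `terms`, `any_of` and `all_of` are made up, and the output is a plain dict in the shape of the bool filter. Each clause is its own element of the `must` or `should` array.

```python
import json


def terms(field, values):
    """A terms filter matching any of the given values."""
    return {"terms": {field: list(values)}}


def any_of(*filters):
    """OR: at least one of the filters has to match."""
    return {"bool": {"should": list(filters)}}


def all_of(*filters):
    """AND: every filter has to match."""
    return {"bool": {"must": list(filters)}}


# (A in [a1, a2] OR B in [b1, b2]) AND C in [c1, c2]
combined = all_of(
    any_of(terms("A", ["a1", "a2"]), terms("B", ["b1", "b2"])),
    terms("C", ["c1", "c2"]),
)

print(json.dumps(combined, indent=2))
```

 A must_not condition could be added in the same style as one more key next to `must` on whichever bool level it should apply to.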



 On Monday, November 3, 2014 5:02:15 PM UTC-5, Ivan Brusic wrote:

 Must clauses are queries that must return a document. In the first query,
 any document returned MUST have a location of Germany. The valueType should
 clause is optional and actually pointless as a filter since it does not
 contribute to scoring.

 Can you explain what your query should be doing in terms of boolean logic?

 --
 Ivan

 On Sat, Nov 1, 2014 at 4:39 PM, kazoompa rha...@p3g.org wrote:

 Hi,

 Below is my data and the two queries that I tested, the first one failing
 and the latter working. I am starting to believe that if one wants to combine
 several SHOULD and MUST filters, the outer one must always be SHOULD. Is
 this a correct assumption? In our application, we have a much more complex
 situation with several filters within each MUST and SHOULD. And lastly,
 where should I place a MUST_NOT in this case?

 Many thanks.



 Here is my data:

 _index,_type,_id,_score,_source.id,_source.type,_source.valueType,_source.sentence,_source.location
 test,var,0,1,0,study,text,Lorem text is jumbled,spain
 test,var,1,1,1,study,text,bla bla bla,spain
 test,var,2,1,2,schema,decimal,ipsum,germany
 test,var,3,1,3,study,integer,lorem,france





 Here is my FAILING query:

 {
   "query": {
     "filtered": {
       "query": {
         "match_all": {}
       },
       "filter": {
         "bool": {
           "must": {
             "terms": {
               "location": ["germany"]
             }
           },
           "should": {
             "terms": {
               "valueType": ["integer"]
             }
           }
         }
       }
     }
   }
 }

 Here is my WORKING query returning IDs 2 and 3:

 {
   "query": {
     "bool": {
       "should": [
         {
           "terms": {
             "location": ["germany"]
           }
         },
         {
           "bool": {
             "must": [
               {
                 "terms": {
                   "valueType": ["integer"]
                 }
               }
             ]
           }
         }
       ]
     }
   }
 }


Re: Disabling default fields (_index, _type, _id, _score) in result list

2014-11-04 Thread Ivan Brusic

Are you using REST? If so, Jorg wrote a plugin to help with such a task:
https://github.com/jprante/elasticsearch-arrayformat

--
Ivan

On Mon, Nov 3, 2014 at 8:36 AM, Lasse Schou lassesc...@gmail.com wrote:

Hi,

I want to know if it's possible to disable the _index, _type, _id
and _score fields in the output list when performing a search query.

Example:

"hits": [
  {
    "_index": "eventlist_2014_10",
    "_type": "eventlist",
    "_id": "lcJu1j5Jvyh9ywJHsPplXA_300343y7a0ktK4iXjeccFse_EDPw_2",
    "_score": 1,
    "fields": {
.

Those fields add a dramatic overhead to my result list, so I'd really like
to disable them.

Thanks in advance,
Lasse


Re: Bool Queries and MUST/SHOULD combinations

2014-11-03 Thread Ivan Brusic

Must clauses are queries that must return a document. In the first query,
any document returned MUST have a location of Germany. The valueType should
clause is optional and actually pointless as a filter since it does not
contribute to scoring.

Can you explain what your query should be doing in terms of boolean logic?

-- 
Ivan

On Sat, Nov 1, 2014 at 4:39 PM, kazoompa rha...@p3g.org wrote:

 Hi,

 Below is my data and the two queries that I tested, the first one failing and
 the latter working. I am starting to believe that if one wants to combine several
 SHOULD and MUST filters, the outer one must always be SHOULD. Is this a
 correct assumption? In our application, we have a much more complex situation
 with several filters within each MUST and SHOULD. And lastly, where should
 I place a MUST_NOT in this case?

 Many thanks.



 Here is my data:

 _index,_type,_id,_score,_source.id,_source.type,_source.valueType,_source.sentence,_source.location
 test,var,0,1,0,study,text,Lorem text is jumbled,spain
 test,var,1,1,1,study,text,bla bla bla,spain
 test,var,2,1,2,schema,decimal,ipsum,germany
 test,var,3,1,3,study,integer,lorem,france





 Here is my FAILING query:

 {
   "query": {
     "filtered": {
       "query": {
         "match_all": {}
       },
       "filter": {
         "bool": {
           "must": {
             "terms": {
               "location": ["germany"]
             }
           },
           "should": {
             "terms": {
               "valueType": ["integer"]
             }
           }
         }
       }
     }
   }
 }

 Here is my WORKING query returning IDs 2 and 3:

 {
   "query": {
     "bool": {
       "should": [
         {
           "terms": {
             "location": ["germany"]
           }
         },
         {
           "bool": {
             "must": [
               {
                 "terms": {
                   "valueType": ["integer"]
                 }
               }
             ]
           }
         }
       ]
     }
   }
 }


Re: Using function_score error

2014-10-29 Thread Ivan Brusic

Mvel has been removed in recent versions of Elasticsearch due to security
issues. Either change your script to use Groovy (preferred) or install the
mvel plugin.

Cheers,

Ivan
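
A sketch of the first option, built as a Python dict for the request body. It assumes that `script_score` accepts a `lang` key next to `script` (as in the 1.3/1.4-era DSL) and that the bare `log(...)` call is written as `Math.log(...)` in Groovy; whether dynamic Groovy scripts are allowed to run at all depends on the version and the node's scripting settings. The helper `likes_views_factor` is hypothetical and only mirrors the arithmetic locally.

```python
import json
import math

body = {
    "query": {
        "function_score": {
            "query": {"match": {"_all": "severed"}},
            "script_score": {
                # Assumption: explicit language selection instead of the mvel default.
                "lang": "groovy",
                "script": "_score * Math.log(doc['likes'].value + doc['views'].value + 1)",
            },
        }
    }
}


def likes_views_factor(likes, views):
    """Local mirror of the multiplier the script applies to _score."""
    return math.log(likes + views + 1)


print(json.dumps(body, indent=2))
print(likes_views_factor(100, 6000))
```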
 On Oct 29, 2014 2:44 PM, Manuel Sciuto msci...@viajeros.com wrote:

 Hello everyone


 I do not understand why this does not work:

 # Create some docs
 PUT /searchtube/video/1
 {
   "title": "Sick Sad World: Cold Breeze on the Interstate",
   "description": "Is your toll collector wearing pants, a skirt, or nothing but a smile? Cold Breeze on the Interstate, next on Sick, Sad World.",
   "views": 500,
   "likes": 2,
   "created_at": "2014-04-22T08:00:00"
 }

 PUT /searchtube/video/2
 {
   "title": "Sick Sad World: The Severed Pianist",
   "description": "When he turned up his nose at accordion lessons, they cut off his inheritance molto allegro. The Severed Pianist, next on Sick, Sad World.",
   "views": 6000,
   "likes": 100,
   "created_at": "2014-04-22T12:00:00"
 }

 # Search with function_score
 GET /searchtube/_search
 {
   "query": {
     "function_score": {
       "query": {"match": {"_all": "severed"}},
       "script_score": {
         "script": "_score * log(doc['likes'].value + doc['views'].value + 1)"
       }
     }
   }
 }


 Error Response

 {
error: SearchPhaseExecutionException[Failed to execute phase
 [query], all shards failed; shardFailures
 {[vrJl1dg1RV2wqGZ2Hqv3zQ][searchtube][0]:
 SearchParseException[[searchtube][0]: from[-1],size[-1]: Parse Failure
 [Failed to parse source [{\n  \query\: {\n\function_score\: {\n
  \query\: {\match\: {\_all\: \severed\}},\n  \script_score\:
 {\n\script\: \_score * log(doc['likes'].value +
 doc['views'].value + 1)\\n  }\n}\n  }\n}\n]]]; nested:
 QueryParsingException[[searchtube] script_score the script could not be
 loaded]; nested: ScriptException[dynamic scripting for [mvel] disabled];
 }{[vrJl1dg1RV2wqGZ2Hqv3zQ][searchtube][1]:
 SearchParseException[[searchtube][1]: from[-1],size[-1]: Parse Failure
 [Failed to parse source [{\n  \query\: {\n\function_score\: {\n
  \query\: {\match\: {\_all\: \severed\}},\n  \script_score\:
 {\n\script\: \_score * log(doc['likes'].value +
 doc['views'].value + 1)\\n  }\n}\n  }\n}\n]]]; nested:
 QueryParsingException[[searchtube] script_score the script could not be
 loaded]; nested: ScriptException[dynamic scripting for [mvel] disabled];
 }{[vrJl1dg1RV2wqGZ2Hqv3zQ][searchtube][2]:
 SearchParseException[[searchtube][2]: from[-1],size[-1]: Parse Failure
 [Failed to parse source [{\n  \query\: {\n\function_score\: {\n
  \query\: {\match\: {\_all\: \severed\}},\n  \script_score\:
 {\n\script\: \_score * log(doc['likes'].value +
 doc['views'].value + 1)\\n  }\n}\n  }\n}\n]]]; nested:
 QueryParsingException[[searchtube] script_score the script could not be
 loaded]; nested: ScriptException[dynamic scripting for [mvel] disabled];
 }{[vrJl1dg1RV2wqGZ2Hqv3zQ][searchtube][3]:
 SearchParseException[[searchtube][3]: from[-1],size[-1]: Parse Failure
 [Failed to parse source [{\n  \query\: {\n\function_score\: {\n
  \query\: {\match\: {\_all\: \severed\}},\n  \script_score\:
 {\n\script\: \_score * log(doc['likes'].value +
 doc['views'].value + 1)\\n  }\n}\n  }\n}\n]]]; nested:
 QueryParsingException[[searchtube] script_score the script could not be
 loaded]; nested: ScriptException[dynamic scripting for [mvel] disabled];
 }{[vrJl1dg1RV2wqGZ2Hqv3zQ][searchtube][4]:
 SearchParseException[[searchtube][4]: from[-1],size[-1]: Parse Failure
 [Failed to parse source [{\n  \query\: {\n\function_score\: {\n
  \query\: {\match\: {\_all\: \severed\}},\n  \script_score\:
 {\n\script\: \_score * log(doc['likes'].value +
 doc['views'].value + 1)\\n  }\n}\n  }\n}\n]]]; nested:
 QueryParsingException[[searchtube] script_score the script could not be
 loaded]; nested: ScriptException[dynamic scripting for [mvel] disabled];
 }],
status: 400
 }



 What am I doing wrong? It is an example that I found at
 https://www.found.no/foundation/function-scoring/


Re: How is it calculated _score

2014-10-28 Thread Ivan Brusic

The default scoring algorithm is based on TF-IDF.

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/practical-scoring-function.html
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scoring-theory.html

You can enable explain to see how documents are scored:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-explain.html

Without knowing more about your system, I suspect it is the IDF that is
causing the mismatch. The IDF is calculated per shard, so when your
documents come from different shards, the scores can be different. Try
using a distributed search type (dfs_query_then_fetch) to see if the issue
still persists:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_search_options.html#search-type
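
To make the per-shard IDF point concrete, here is a minimal Python sketch. It assumes Lucene's classic TF-IDF formula, idf = 1 + ln(maxDocs / (docFreq + 1)). The first pair of figures (docFreq=90, maxDocs=28985) is the one that appears in the explain output posted later in this thread; the second shard's figures are made up purely for illustration.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Lucene classic (TF-IDF) inverse document frequency for one shard."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# Figures reported by explain for one shard later in this thread.
shard_a = classic_idf(doc_freq=90, max_docs=28985)

# Hypothetical second shard: similar size, slightly fewer matching documents.
shard_b = classic_idf(doc_freq=86, max_docs=28950)

print(f"shard A idf: {shard_a:.4f}")
print(f"shard B idf: {shard_b:.4f}")
```

As far as I can tell, for a single-term match where termFreq and fieldNorm are both 1, the score collapses to the IDF itself, which would explain why documents with the same name get slightly different scores depending on the shard they live on.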

Cheers,

Ivan

On Tue, Oct 28, 2014 at 3:38 PM, Manuel Sciuto msci...@viajeros.com wrote:

 Hello

 How is the score calculated?

 GET /business/actividades,alojamiento,comida,transporte__servicios/_search
 {
   "query": {
     "filtered": {
       "query": {
         "match": {
           "name": "Sheraton"
         }
       }
     }
   }
 }

 Response

 {
   "took": 4,
   "timed_out": false,
   "_shards": {
     "total": 5,
     "successful": 5,
     "failed": 0
   },
   "hits": {
     "total": 506,
     "max_score": 6.8087983,
     "hits": [
       {
         "_index": "business",
         "_type": "alojamiento",
         "_id": "273825",
         "_score": 6.8087983,
         "_source": {
           "name": "Sheraton",
           "reviews": 2
         }
       },
       {
         "_index": "business",
         "_type": "alojamiento",
         "_id": "252355",
         "_score": 6.8087983,
         "_source": {
           "name": "Sheraton",
           "reviews": 1
         }
       },
       {
         "_index": "business",
         "_type": "alojamiento",
         "_id": "132774",
         "_score": 6.8087983,
         "_source": {
           "name": "Sheraton",
           "reviews": 1
         }
       },
       {
         "_index": "business",
         "_type": "alojamiento",
         "_id": "225509",
         "_score": 6.8087983,
         "_source": {
           "name": "Sheraton",
           "reviews": 2
         }
       },
       {
         "_index": "business",
         "_type": "alojamiento",
         "_id": "232124",
         "_score": 6.8087983,
         "_source": {
           "name": "Sheraton",
           "reviews": 1
         }
       },
       {
         "_index": "business",
         "_type": "alojamiento",
         "_id": "219172",
         *"_score": 6.8087983,*
         "_source": {
           "name": "Sheraton",
           "reviews": 0
         }
       },
       {
         "_index": "business",
         "_type": "alojamiento",
         "_id": "224180",
         *"_score": 6.7636743,*
         "_source": {
           "name": "Sheraton",
           "reviews": 3
         }
       },
       {
         "_index": "business",
         "_type": "alojamiento",
         "_id": "268979",
         "_score": 6.7636743,
         "_source": {
           "name": "Sheraton",
           "reviews": 12
         }
       },
       {
         "_index": "business",
         "_type": "alojamiento",
         "_id": "228353",
         "_score": 6.7636743,
         "_source": {
           "name": "Sheraton",
           "reviews": 2
         }
       },
       {
         "_index": "business",
         "_type": "alojamiento",
         "_id": "112508",
         "_score": 6.7636743,
         "_source": {
           "name": "Sheraton",
           "reviews": 9
         }
       }
     ]
   }
 }

 Why is the score different in some cases if the name is the same?

 Thanks!!



Re: How is it calculated _score

2014-10-28 Thread Ivan Brusic

,
  description: termFreq=1.0
   }
]
 },
 {
value: 6.7636743,
description: idf(docFreq=90, maxDocs=28985)
 },
 {
value: 1,
description: fieldNorm(doc=24214)
 }
  ]
   }
]
 }
  }
   ]
}
 }



Re: plan for river

2014-10-27 Thread Ivan Brusic

There is nothing magical about rivers. With some Java code changes, most
rivers can be made to run as standalone Java processes. The only thing the
rivers do is (weakly) guarantee that only one river instance is run per
cluster.

Cheers,

Ivan

On Mon, Oct 27, 2014 at 4:11 AM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

Yes, there is already a substitution, the JDBC feeder in the JDBC river
repo.

Future versions of JDBC river will no longer rely on the river API.

Jörg

On Mon, Oct 27, 2014 at 12:49 AM, Mungeol Heo mungeol@gmail.com
wrote:

Hi,

My question is: will ES remove all river-related plugins in the future?
If so, I'd like to know whether there is a substitute for the JDBC river.
Thanks.

Best regards,

- Mungeol


Re: Aggregation buckets, with an additional key:value inside.

2014-10-25 Thread Ivan Brusic

I maintain a mapping on the client side to do the lookups. Thankfully my
taxonomy is static (but somewhat large). There is a PR to do server-side
mappings, but it is quite old and I don't think it would apply to aggregations.

An alternative solution would be to create compound values such as
"58:Car Rental" and decompose the value on the client side, but this
would create a string aggregation, which could have slower performance.
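
Both approaches can be sketched in a few lines of Python. This is only an illustration under assumptions: the CATEGORY_NAMES table, the label_buckets and split_compound helpers, and the "unknown" fallback are hypothetical names of mine, and the buckets are hand-written in the key/doc_count shape that a terms aggregation returns.

```python
from typing import Dict, List, Tuple

# Hypothetical static taxonomy kept on the client: category_id -> category_name.
CATEGORY_NAMES: Dict[int, str] = {58: "Car Rental", 1008: "Fast Food"}


def label_buckets(buckets: List[dict], names: Dict[int, str]) -> List[dict]:
    """Attach category_name to each terms-aggregation bucket."""
    return [
        {
            "category_name": names.get(b["key"], "unknown"),
            "category_id": b["key"],
            "offer_count": b["doc_count"],
        }
        for b in buckets
    ]


def split_compound(value: str) -> Tuple[int, str]:
    """Decompose a compound 'id:name' term into (id, name)."""
    raw_id, name = value.split(":", 1)
    return int(raw_id), name


# Buckets shaped like a terms aggregation response ("key" / "doc_count").
buckets = [{"key": 58, "doc_count": 48885}, {"key": 1008, "doc_count": 44530}]
category_stats = label_buckets(buckets, CATEGORY_NAMES)
print(category_stats)
print(split_compound("58:Car Rental"))
```

The lookup table keeps the aggregation on the integer field; the compound form needs no table but aggregates on strings.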

Cheers,

Ivan

On Fri, Oct 24, 2014 at 5:50 PM, Cody Stringham cs.nega...@gmail.com
wrote:

 Hey everyone,

 These aggregations are working out great, but I need to return more than
 one value in the bucket so we can use them in our API. The basic idea is
 that we aggregate all of the category id's, but we also want the
 category_name to be included in that same bucket for ease of use.


 *Mapping:*
 "categories": {
   "properties": {
     "category_name": {
       "analyzer": "keyword",
       "type": "string"
     },
     "category_id": {
       "type": "integer"
     },
     "parent_id": {
       "type": "integer"
     }
   }
 }

 *Aggs:*
 "aggs": {
   "categories": {
     "terms": {
       "size": 130,
       "field": "categories.category_id"
     }
   },


 *Returns (actual):*

 "category_stats": [
   {
     "category_id": 58,
     "offer_count": 48885
   },
   {
     "category_id": 1008,
     "offer_count": 44530
   },

 ...



 *Returns (desired):*

 "category_stats": [
   {
     "category_name": "Car Rental",
     "category_id": 58,
     "offer_count": 48885
   },
   {
     "category_name": "Fast Food",
     "category_id": 1008,
     "offer_count": 44530
   },

 ...


Re: Migration of 0.90.3 cluster to new cluster running 1.3.4

2014-10-24 Thread Ivan Brusic

Unless you are moving to new hardware, there is no need to rsync your data.
Both Elasticsearch 0.90.x and 1.3.x are based on Lucene 4, so the
underlying data is compatible. Of course, you should backup your data
before such an upgrade.

After restarting your new cluster with your old data, I would run an
optimize on your indices so that Lucene can upgrade all your segments into
the new format. There have been some issues with Lucene format
incompatibilities, but they usually deal with indices with beta Lucene
versions.

You cannot bring up a mixed cluster between 0.90 and 1.x, so you would need
to stop all your VMs. Why are you interested in primary shards?
Elasticsearch is not like most databases, where the primary node has an extra
special connotation. I have not played around with shard allocation much,
but here is an old article:
http://blog.sematext.com/2012/05/29/elasticsearch-shard-placement-control/

Cheers,

Ivan

On Thu, Oct 23, 2014 at 4:18 PM, Magnus Persson magnus.e.pers...@gmail.com
wrote:

Ah, slight typo in regard to the old cluster. It is 1 replica per index.

On Thursday, October 23, 2014 10:13:57 PM UTC+2, Magnus Persson wrote:

So I'm about to upgrade to 1.3.4, but due to some unfortunate
circumstances I need to migrate my ES cluster to new VMs.
The environment is fairly simple. At the top I have logstash agent
pulling messages off of a Redis server and feeding it to my 2 node cluster
(2 replicas, 2 shards per index). So for what it's worth I can stop
logstash and the cluster will essentially stop indexing data, allowing me
to shut it down without issue. Once I have the old cluster shut down, I
intend to rsync it over to the new cluster which is 3 nodes (2 replicas, 3
shards per index).
What is the best approach here? I was thinking that I could rsync the
data folder from 1 of my 2 VMs running on the old cluster but then I
realized that the primary shard for each index might not be on that VM. Can
I manually set the primary shard somehow?


Re: OutOfMemory

2014-10-23 Thread Ivan Brusic

There is more to the issue than merely your configuration. What are your
queries? Are you doing a lot of aggregations, especially on
high-cardinality fields? What kind of hardware are you running now?

Using the API, look at your field cache usage. The field cache is held
within the Java heap space, so a large field cache could cause the out of
memory issues.
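
As a rough sketch of what "look at your field cache usage" could mean in practice, the Python below summarizes field data memory per node from a nodes-stats-style response. The nesting nodes -> id -> indices -> fielddata -> memory_size_in_bytes is my assumption about the response shape and should be checked against what your 0.90 cluster actually returns; the sample numbers and the 8 GiB heap are invented.

```python
from typing import Dict


def fielddata_bytes_per_node(node_stats: dict) -> Dict[str, int]:
    """Map node name -> field data memory in bytes from a nodes-stats-like dict."""
    usage = {}
    for node_id, node in node_stats.get("nodes", {}).items():
        fielddata = node.get("indices", {}).get("fielddata", {})
        usage[node.get("name", node_id)] = fielddata.get("memory_size_in_bytes", 0)
    return usage


# Made-up sample, NOT real cluster output.
sample = {
    "nodes": {
        "a1": {"name": "node-1",
               "indices": {"fielddata": {"memory_size_in_bytes": 6_000_000_000}}},
        "b2": {"name": "node-2",
               "indices": {"fielddata": {"memory_size_in_bytes": 1_500_000_000}}},
    }
}

heap_bytes = 8 * 1024 ** 3  # assumed 8 GiB heap per node
for name, used in fielddata_bytes_per_node(sample).items():
    print(f"{name}: {used / heap_bytes:.0%} of heap in field data")
```

A node whose field data takes up most of the heap is a likely candidate for the out of memory errors.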

Cheers,

Ivan

On Thu, Oct 23, 2014 at 1:26 PM, Eike Dehling e...@buzzcapture.com wrote:

Hi all,

We are using elasticsearch in our production backing dashboards with
social media data. We are running 0.90 version.

Our volume of data grows every day, so we semi-regularly add servers to
our cluster to keep things running smoothly. A year ago we went from 4 nodes
to 6 nodes. One month ago we went from 6 to 8 nodes. Only now, we are
seeing OutOfMemory-like issues again.

My question: It surprises me that we are hitting resource limits again so
soon; the increase in data does not explain that very well. Any suggestions for
causes?

Best regards,
Eike dehling


Re: Shard Recommendation for Elasticsearch

2014-10-19 Thread Ivan Brusic

Each shard is a Lucene index, so it will consume resources at the file
system level. Elasticsearch itself will be able to handle the coordination
between many shards. You next need to think about how much data each shard
actually has. Distributed logging can create volumes of logs, perhaps too
much for a 4 node cluster.
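
A back-of-the-envelope sketch of how quickly daily indices add up, since every shard is its own Lucene index. The replica count of 1 is an assumption for illustration (the thread does not state it), and total_shards and shards_per_node are hypothetical helper names.

```python
def total_shards(days: int, primaries_per_index: int, replicas: int) -> int:
    """Total shard copies (primaries plus replicas) with one index per day."""
    return days * primaries_per_index * (1 + replicas)


def shards_per_node(total: int, data_nodes: int) -> float:
    """Average number of shard copies each data node has to hold."""
    return total / data_nodes


# One year of daily indices, assuming 1 replica per primary.
year_with_5 = total_shards(days=365, primaries_per_index=5, replicas=1)
year_with_4 = total_shards(days=365, primaries_per_index=4, replicas=1)

print(year_with_5, shards_per_node(year_with_5, data_nodes=4))
print(year_with_4, shards_per_node(year_with_4, data_nodes=4))
```

Closing or deleting old indices shrinks the days term, which is usually the cheapest knob to turn.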

--
Ivan

On Sun, Oct 19, 2014 at 6:07 AM, elor...@gmail.com wrote:

Hi Ivan,

Thanks for the reply. So if I store data, one index per day, across 6 data
nodes (4 or 5 shards each node) for a year..that's something like 10,000
shards in the cluster. Does that make sense? And also, is this safe?

On Saturday, October 18, 2014 2:41:50 PM UTC-4, Ivan Brusic wrote:

The number of shards will help you scale out in case you add more nodes
in the future. With your current shard count at 5, you cannot optimally
deploy and distribute a 6+ node cluster. However, your data is time-based,
one index per day. Are queries on historical data important? I would start
off with a shard count of 4 per index, letting each node receive part of the
index (ideally more of the index with replication) and then change the shard
count in case you increase your cluster. Your older indices may not be
optimally distributed, but your new ones, and presumably your more
important ones, will be.

Cheers,

Ivan

On Sat, Oct 18, 2014 at 7:04 AM, elo...@gmail.com wrote:

Hi All,

I am currently planning to grow my cluster from 2 to 4 Elasticsearch data
nodes and have a question regarding how many shards to use for the indexes.
I am running the ELK stack and currently each index, one per day, is
creating 5 shards per node. As you can imagine this
will create a lot of shards across the nodes over a period of time. I have
read that having too many shards is bad for the cluster's health. Is there
a better way to calculate the best shard / replica strategy to avoid issues
but maintain redundancy? Thanks for your help.


Re: running master nodes on application servers

2014-10-18 Thread Ivan Brusic

Is there some reason why your Elasticsearch nodes cannot serve as both
master and data? I believe that dedicated master nodes should only come
into play with large clusters, way beyond the 3 you have. If your master
nodes are tied to your app nodes, then I believe you will have less
resiliency, since app servers are the ones that are normally rebooted in case
of issues.

--
Ivan

On Fri, Oct 17, 2014 at 10:59 PM, webish greg...@yoursports.com wrote:

I have a cluster of 3 r3.2xlarge aws servers. Each is a master and data
node.

Has anyone considered running a cluster of data nodes and then running
master nodes on each application server? Currently my application servers
have auto scaling policies.

The application server is pointing to a single ES node which isn't
distributing the load very well. Also, it has created a single point of
failure for the cluster.

I'm thinking of changing the 3 r3.2xlarge servers to be data-only nodes, and
each app node would have its own master node used to communicate with ES...


Re: Filters: odd behavior

2014-10-18 Thread Ivan Brusic

The structure of your query is odd. Either it is some format that I am not
aware of or the Elasticsearch parser is not doing a good job at determining
it is invalid.

Your two filters should be joined via a bool filter. Something like (not
tested):

{
  "query": {
    "filtered": {
      "query": {

      },
      "filter": {
        "bool": {
          "must": [
            {
              "geo_distance": {
                "distance": "30km",
                "Location.location": {
                  "lat": -32.890183,
                  "lon": -68.844050
                }
              }
            },
            {
              "not": {
                "filter": {
                  "query": {
                    "terms": {
                      "_all": [
                        "sex",
                        "xxx",
                        "sexshop"
                      ]
                    }
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}
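
If the request is built in code rather than by hand, the same structure can be assembled programmatically. The Python sketch below is untested against a live cluster: geo_without_terms is a hypothetical helper name, and the match_all query is only a placeholder of mine, since the real query is elided in this thread.

```python
import json


def geo_without_terms(lat, lon, distance, banned, field="Location.location"):
    """Build a filtered query whose two filters are combined with bool/must."""
    return {
        "query": {
            "filtered": {
                # Placeholder: the thread does not show the actual query.
                "query": {"match_all": {}},
                "filter": {
                    "bool": {
                        "must": [
                            # Keep only documents within the given distance.
                            {"geo_distance": {
                                "distance": distance,
                                field: {"lat": lat, "lon": lon},
                            }},
                            # Drop documents containing any banned term.
                            {"not": {"filter": {"query": {
                                "terms": {"_all": banned},
                            }}}},
                        ]
                    }
                },
            }
        }
    }


body = geo_without_terms(-32.890183, -68.844050, "30km", ["sex", "xxx", "sexshop"])
print(json.dumps(body, indent=2))
```

Both clauses sit in one must list, so a document has to pass the distance check and the banned-term check together.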

-- 
Ivan

On Sat, Oct 18, 2014 at 7:04 AM, @mromagnoli marce.romagn...@gmail.com
wrote:

 Thanks guys for your responses.

 My question was due to some strange behavior when using the 'not' and
 'geo_distance' filters.

 I want to filter out results that contain undesirable words, such as 'sex',
 'xxx', etc., and then geo filter the remaining good results. But if I place
 the 'not' filter first, then when the geo filter is applied, it retrieves
 results with those unwanted words in them. If I place the geo filter first and
 then the 'not' filter, the geo filter seems not to be executed, because the
 results are not accurate for that filter; they are still good results
 with no bad words.

 I am using it like this:


 {
   "query": {
     "filtered": {
       "query": {...},
       "filter": [
         [{
           "geo_distance": {
             "distance": "30km",
             "Location.location": {
               "lat": -32.890183,
               "lon": -68.844050
             }
           }
         }],
         [{
           "not": {
             "filter": {
               "query": {
                 "terms": {
                   "_all": ["sex", "xxx", "sexshop"]
                 }
               }
             }
           }
         }]
       ]
     }
   },
   "from": 0,
   "size": 10,
   "sort": {
     "_geo_distance": {
       "Location.location": {
         "lat": -32.890183,
         "lon": -68.844050,
         "order": "desc"
       }
     },
     "_score": "desc"
   }
 }


 It seems like the last filter is executed correctly.

 Thanks,

 Marce


 On Thursday, October 16, 2014 09:16:43 UTC-3, @mromagnoli wrote:

 Hi everyone,
 I have a doubt about Filters.

 If I have more than one filter, in a filtered query, are they executed in
 the defined order? And, are they filtering in a 'chain' mode, i.e. using
 the results of the previous filters?

 Thanks in advance as always.


Re: Shard Recommendation for Elasticsearch

2014-10-18 Thread Ivan Brusic

The number of shards will help you scale out in case you add more nodes in
the future. With your current shard count at 5, you cannot optimally deploy
and distribute a 6+ node cluster. However, your data is time-based, one index
per day. Are queries on historical data important? I would start off with a
shard count of 4 per index, letting each node receive part of the index
(ideally more of the index with replication) and then change the shard count
in case you increase your cluster. Your older indices may not be optimally
distributed, but your new ones, and presumably your more important ones,
will be.

Cheers,

Ivan

On Sat, Oct 18, 2014 at 7:04 AM, elor...@gmail.com wrote:

Hi All,


Re: Get only ids with no source Java API

2014-10-17 Thread Ivan Brusic

Have you tried setting no fields to be returned or the explicit
setNoFields() method?

http://jenkins.elasticsearch.org/job/Elasticsearch%20Master%20Branch%20Javadoc/Elasticsearch_API_Documentation/org/elasticsearch/action/search/SearchRequestBuilder.html#setNoFields()

-- 
Ivan

On Thu, Oct 16, 2014 at 2:45 AM, Ilija Subasic subasic.il...@gmail.com
wrote:

 Is there a way in Elasticsearch, using the Java API, to get only the ids of
 the documents returned for a given query?

 SearchResponse sr =
  esClient.prepareSearch(index).setSize(resultSize).setQuery(q).setScroll(new
 TimeValue(1)).setQuery(fqb).setFetchSource(false).get();

 but I get empty hits (`sr.getHits().hits[].length == 0`) although the total
 count of returned hits is not 0 (`sr.getHits().getTotalHits == 2`). I
 understand that nothing is returned by Elasticsearch because I set fetch
 source to false, but the ids should somehow be available. My current
 solution is:

 SearchResponse sr =
  esClient.prepareSearch(index).setSize(resultSize).setQuery(q).setScroll(new
 TimeValue(1)).setQuery(fqb).setFetchSource("_id", null).get();

 However I think that gets the _id field from source, and for speed I would
 like to avoid this if possible.

 Thanks,
 Ilija


Re: Filters: odd behavior

2014-10-17 Thread Ivan Brusic

They are indeed executed in the defined order. Filters that are more
specific should be placed early on and those that cannot be cached
(geo/timebased) should be placed last.

Cheers,

Ivan

On Thu, Oct 16, 2014 at 5:16 AM, @mromagnoli marce.romagn...@gmail.com
wrote:

Hi everyone,
I have a doubt about Filters.

If I have more than one filter, in a filtered query, are they executed in
the defined order? And, are they filtering in a 'chain' mode, i.e. using
the results of the previous filters?

Thanks in advance as always.


Re: Update similarity measure for existing index

2014-10-09 Thread Ivan Brusic

You cannot change the similarity on an existing index. There is no
technical reason why it could not be allowed; it appears to be simply a
safeguard to prevent users from creating potentially huge errors. I say that
developers should have the option to shoot themselves in the foot!
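
Since the similarity has to be in place when an index is created, one workaround is to create a new index with the desired similarity and reindex into it. The sketch below only builds the create-index body. The similarity name `my_bm25`, the type `doc`, and the field `title` are illustrative assumptions, and the layout is assumed to follow the 1.x similarity module.

```python
import json


def build_bm25_index_body(field="title", similarity_name="my_bm25"):
    """Body for creating a NEW index whose field uses a BM25 similarity."""
    return {
        "settings": {
            "index": {
                # Named similarity, declared once in the index settings.
                "similarity": {similarity_name: {"type": "BM25"}}
            }
        },
        "mappings": {
            # "doc" is a hypothetical type name.
            "doc": {
                "properties": {
                    # The field refers to the similarity by name.
                    field: {"type": "string", "similarity": similarity_name}
                }
            }
        },
    }


bm25_body = build_bm25_index_body()
print(json.dumps(bm25_body, indent=2))
```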

Cheers,

Ivan

On Thu, Oct 9, 2014 at 6:58 AM, CC cr1st1na.garba...@gmail.com wrote:

I have an existing index for which the default ElasticSearch similarity is
used for all fields. I would like to update this index and set some other
type of similarity, like BM25. The query I tried is:

curl -XPOST 'http://localhost:9200/myindex/' -d \
'{"settings":{"similarity":{"newSimilarity":{"type":"BM25"}}}}'

However, this crashes with an IndexAlreadyExists exception. Still, is it
possible to update the similarity measure for all fields inside this index
without having to reindex the data?

Thanks!


Re: Filter by specific value without mapping

2014-10-07 Thread Ivan Brusic

The fields do not need a custom analyzer; they just need to be marked
as not_analyzed.

You can set up a dynamic template that states any new string field should be
not_analyzed.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-root-object-type.html#_dynamic_templates

You can still hardcode the mapping for specific fields.
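
A minimal sketch of such a dynamic template, written as the mapping body one would send when creating the index. The type name `report` and the template name `strings_not_analyzed` are assumptions; the second helper mirrors the top-level `filter` request shape used in the question below.

```python
import json


def build_not_analyzed_mapping(type_name="report"):
    """Mapping whose dynamic template makes every new string field not_analyzed."""
    return {
        "mappings": {
            type_name: {
                "dynamic_templates": [
                    {
                        # Hypothetical template name.
                        "strings_not_analyzed": {
                            "match": "*",
                            "match_mapping_type": "string",
                            "mapping": {"type": "string", "index": "not_analyzed"},
                        }
                    }
                ]
            }
        }
    }


def build_term_filter(field, value):
    """Exact-match filter; it only matches if the field was indexed unanalyzed."""
    return {"filter": {"term": {field: value}}}


mapping_body = build_not_analyzed_mapping()
search_body = build_term_filter("approval", "not approved")
print(json.dumps(mapping_body, indent=2))
print(json.dumps(search_body, indent=2))
```

With the field stored as a single unanalyzed term, the value "not approved" is indexed whole instead of being split into separate words.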

Cheers,

Ivan
On Oct 7, 2014 2:57 AM, Vladimir Krylov s6nu...@gmail.com wrote:

What I'm trying to do is fetch data by filtering on a term with exact
matching. I have ES 1.3.2 and I cannot define a mapping, as attributes are
dynamic (different users have different attributes). My data:

{ "id": 111,
"org_id": 11,
"approval": "approved",
...
}

{ "id": 112,
"org_id": 11,
"approval": "not approved",
...
}

This request returns results:

curl -X GET 'host:9200/data/_search?pretty' -d '{
"filter":{
"term":{
"approval":"approved"
}
}
}'

But this one does not:

curl -X GET 'host:9200/reports/_search?pretty' -d '{
"filter":{
"term":{
"approval":"not approved"
}
}
}'

This duplicates ticket
https://github.com/elasticsearch/elasticsearch/issues/8006#issuecomment-58160111
as I was directed here.

David Pilato suggested that the index probably contains the separate terms
"not" and "approved", so there is no exact match for "not approved". I tried
searching for the word "not" and it works:

curl -X GET 'host:9200/reports/_search?pretty' -d '{
"filter":{
"term":{
"approval":"not"
}
}
}'

So, how can I filter by exact matching on "not approved"?


Re: Understanding doc_values?

2014-10-07 Thread Ivan Brusic

Perhaps it is easier to talk about the downsides of doc_values.

If you have slow disks, common when using low level VMs with shared disks,
then retrieving your data will be much slower.

Also, you cannot filter on doc_values fields, so it depends on your other
use cases.

The amount field seems like a good candidate for doc_values, but it depends
on the downsides I highlighted above.
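
For reference, enabling doc values is a per-field switch in the mapping. The sketch below builds such a mapping for the `amount` field from the question, plus a `sum` aggregation over it. The type name `order` and the `double` type are assumptions, and the `doc_values` flag is written as in the 1.x mapping documentation.

```python
import json


def build_amount_mapping(type_name="order"):
    """Mapping that stores the 'amount' field as on-disk doc values."""
    return {
        "mappings": {
            # "order" is a hypothetical type name.
            type_name: {
                "properties": {
                    "amount": {"type": "double", "doc_values": True}
                }
            }
        }
    }


def build_sum_aggregation(field="amount"):
    """Search body that returns only the sum of the given field."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "aggs": {"total_" + field: {"sum": {"field": field}}},
    }


amount_mapping = build_amount_mapping()
sum_body = build_sum_aggregation()
print(json.dumps(amount_mapping, indent=2))
print(json.dumps(sum_body, indent=2))
```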

Cheers,

Ivan
On Oct 7, 2014 6:11 AM, Michaël Gallego mich...@maestrooo.com wrote:

Hi,

With the release of Elasticsearch 1.4, I discovered doc_values.
However, their use remains a bit obscure to me, and the documentation
didn't help. As I understand it, they are mostly useful when performing
aggregations, as they reduce the amount of data loaded into
memory. The docs recommend enabling doc_values on not_analyzed strings, as
well as on values that are used for aggregations. But if, for instance, my
aggregations are about summing one value called amount, does that make the
amount integer/double a good candidate for doc_values, or are they only
useful for properties that are space consuming?

Overall, if my use case is almost exclusively aggregations, should I
set all properties as doc_values, except the analyzed strings?

Thanks!


Re: Pattern replace apostrophes?

2014-10-07 Thread Ivan Brusic

What type of query are you using? Perhaps the query you are using is not
using the same analyzer at search time.
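
One way to rule that out is to declare the analyzer once in the index settings and rely on it at both index and search time. A minimal sketch, assuming a `pattern_replace` char filter; the names `strip_apostrophes` and `no_apostrophe` are made up, and the small `strip` helper only mimics the char filter locally so the effect can be checked without a cluster.

```python
import json
import re

APOSTROPHE_PATTERN = "'"


def build_analysis_settings():
    """Index settings defining an analyzer that drops apostrophes before tokenizing."""
    return {
        "settings": {
            "analysis": {
                "char_filter": {
                    # Hypothetical char filter name.
                    "strip_apostrophes": {
                        "type": "pattern_replace",
                        "pattern": APOSTROPHE_PATTERN,
                        "replacement": "",
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name.
                    "no_apostrophe": {
                        "type": "custom",
                        "char_filter": ["strip_apostrophes"],
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    }
                },
            }
        }
    }


def strip(text):
    """Local stand-in for the char filter: remove every apostrophe."""
    return re.sub(APOSTROPHE_PATTERN, "", text)


analysis_settings = build_analysis_settings()
print(json.dumps(analysis_settings, indent=2))
print(strip("aaa's"))
```

A `match` query runs its input through the field's analyzer, whereas a `term` query does not analyze its input at all, which is a common reason for index-time and search-time tokens to disagree.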

--
Ivan

On Tue, Oct 7, 2014 at 6:06 AM, Lee Gee lee...@gmail.com wrote:

My users have issues with apostrophes: I need to index and search "aaa's"
as it is, and without the apostrophe, as "aaas".

If I use a char_filter to remove apostrophes when indexing and when
searching, the _analyze endpoint shows me that they produce 'words' without
apostrophes like this (respectively):

{... {
end_offset = 5,
position = 1,
start_offset = 0,
token = aaas,
type = word,
} }

{
end_offset = 5,
position = 1,
start_offset = 0,
token = aaas,
type = word,
},

But there seems to be nothing I can do to find "aaas" / "aaa's" when
searching!

Is this expected?

TIA
Lee


Re: Recommendation on reading the heart of the code.

2014-10-06 Thread Ivan Brusic

There is no voting or other gamification, just a plain ol' mailing list.
Many of us respond as just another way to contribute to open-source.

--
Ivan

On Mon, Oct 6, 2014 at 3:37 AM, ahmed jamal maaz jamalahmedm...@gmail.com
wrote:

Hi all,

This is very good advice. I really appreciate it.

I am planning to start all of these in parallel and find which one
suits me best.

Thank you: Joerg, Isabel, Kevin and Ivan.

I have a question: how do we vote on an answer here (like on Quora)? Is this
available with Google Groups?

Obliged
Jamal

On Mon, Oct 6, 2014 at 2:45 PM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

A great start is by studying and writing plugins.

Elasticsearch is one of the rare masterpieces of software that allow you to
plug in code that you have authored to extend functionality, without
forking the main code base.

There are a lot of plugins out there with small code bases that are easy to
study, e.g. on GitHub.

A very nice side effect is that you can open source your plugins so you
can give something back to the community, and also have more eyes on your
code to find and fix bugs.

Like Ivan already said, if you want to dive deeper into the ES API, look
at the ES tests. The Java test code base is extensive and covers almost all
use cases. You will find usage examples of the API all over the place.

Jörg

On Sun, Oct 5, 2014 at 5:44 PM, ahmed jamal maaz
jamalahmedm...@gmail.com wrote:

Hi,

I am a newbie to Elasticsearch and Lucene. I am going through the source
code. I like the general idea of search on Lucene.

My Problem:
There is a ton of code (Elasticsearch + Lucene) in the project. I am
sending requests (curl url) and stepping through them in debug mode. Are
there any recommendations or tips that could help me speed this up?
It is taking too long. I don't mind spending the time, but are there
parts I should look at first, so that I can learn the code base
completely and in a structured manner?

Current State:
I have reached the bottom of insertion (creating a new index) and general
querying.

Thanks in advance.

Obliged
Jamal


Re: Recommendation on reading the heart of the code.

2014-10-05 Thread Ivan Brusic

The code is difficult to debug due to the distributed nature of
Elasticsearch. Requests get serialized and are sent via a binary protocol,
so you cannot focus on specific classes. Dependency injection adds in
additional complexity. You cannot simply determine the construction time
values in the constructor by browsing the code.

My advice, if you want to learn the code, is to start with the unit tests.
Focus on the smaller pieces and work your way up.

Cheers,

Ivan
On Oct 5, 2014 8:44 AM, ahmed jamal maaz jamalahmedm...@gmail.com wrote:

Hi,

I am a newbie to Elasticsearch and Lucene. I am going through the source
code. I like the general idea of search on Lucene.

Current State:
I have reached the bottom of insertion (creating a new index) and general
querying.

Thanks in advance.

Obliged
Jamal


Re: recent elasticsearch vs solrcloud comparison ?

2014-10-03 Thread Ivan Brusic

I have not looked at that video, but most comparisons are with Solr and not
SolrCloud.

--
Ivan
On Oct 3, 2014 12:37 PM, Gaurav gupta gupta.gaurav0...@gmail.com wrote:

Kevin,
I found a recent comparison from the search experts at
http://berlinbuzzwords.de/session/side-side-elasticsearch-and-solr
It appears that ES is now the preferred choice due to better distributed
support.

cheers,
Gaurav

On Fri, Oct 3, 2014 at 8:16 AM, Kevin Burton burtona...@gmail.com wrote:

It seems that both elastic search and solr cloud are in a bit of an
apples and oranges situation right now. Neither one seems to be clearly
superior and one might be more suited for one role vs another.

But I can't find a very good comparison of both engines.

Is there anything recent that anyone can recommend? It would be good to see
the pros and cons of both solutions.


Re: Upgrading from very old version of ES with zero down time

2014-10-02 Thread Ivan Brusic

Your indices should be fine as is. Lucene is guaranteed to be able to read
data from 1 major revision prior. Elasticsearch 0.20 is Lucene 3 and the
latest Elasticsearch is Lucene 4. Because of various bugs at the Lucene
level, you should run an optimize (normally discouraged) to upgrade the
indices within Lucene.

Can you hold off indexing new content during the transition? If you want
zero downtime, it is essential. Also, what type of client are you using?
Although the Java transport client is the most flexible, it can be an issue
for zero downtime upgrades (with versions prior to 1.0).

If you can wait a little longer, I would hold off for Elasticsearch 1.4,
which is currently in beta. I think it would be the biggest release since
1.0.

Cheers,

Ivan

On Thu, Oct 2, 2014 at 7:57 AM, joergpra...@gmail.com joergpra...@gmail.com
wrote:

FYI I have prepared Knapsack plugin for Elasticsearch 0.20.6

https://github.com/jprante/elasticsearch-knapsack

Source code tag:

https://github.com/jprante/elasticsearch-knapsack/releases/tag/0.20.6.2

Link of Plugin ZIP for 0.20.6.2:

http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-knapsack/0.20.6.2/elasticsearch-knapsack-0.20.6.2-plugin.zip

Jörg

On Thu, Oct 2, 2014 at 12:42 PM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

If you wish, I can prepare a knapsack backport for 0.20, then you can
dump all your data into an archive file, and reimport the archive into a
higher version.

Jörg

On Thu, Oct 2, 2014 at 2:28 AM, Eugene Strokin eug...@strokin.info
wrote:

Hello,
my ES cluster is still running version 0.20.1. It is time to upgrade. I
know I cannot just keep the indexes as they are and replace the jars with
the newest ES. They are not compatible, as far as I understand.
So I need to set up a parallel cluster with the newest ES and somehow
transfer all the data with zero downtime. The size of the indexes is about
100 GB and the traffic is relatively heavy, so it could take some time, and
somehow I need to keep the clusters in sync.
Has anyone had such an experience?
Does anyone have suggestions on how to approach this?
I cannot come up with an elegant solution.
Any help is greatly appreciated.
Thank you,
Eugene


Re: Error starting up Elasticsearch

2014-09-29 Thread Ivan Brusic

Seems like a version mismatch. What versions of elasticsearch/logstash are
you using? Are you using the 'elasticsearch' output in logstash or
'elasticsearch_http'? Try using the latter.

-- 
Ivan

On Mon, Sep 29, 2014 at 1:24 PM, larrychu...@gmail.com wrote:

 I get this in the logs when starting up elasticsearch and looking for some
 guidance:

 I can see logstash sending over the logs but elasticsearch isn't creating
 indexes and nothing shows in kibana.

 Grateful if anyone could point me in the right direction.

 [2014-09-29 09:48:23,121][WARN ][transport.netty  ] [mynode]
 exception caught on transport layer [[id: 0x0a12176c, /192.x.x.x:20035 :
 /192.x.x.x:9300]], closing connection
 java.io.StreamCorruptedException: invalid internal transport message format
 at org.elasticsearch.transport.netty.SizeHeaderFrameDecoder.decode(SizeHeaderFrameDecoder.java:46)
 at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
 at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:482)
 at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
 at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
 at org.elasticsearch.common.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:360)
 at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:81)
 at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:36)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:574)
 at org.elasticsearch.common.netty.channel.Channels.close(Channels.java:812)
 at org.elasticsearch.common.netty.channel.AbstractChannel.close(AbstractChannel.java:197)
 at org.elasticsearch.transport.netty.NettyTransport.exceptionCaught(NettyTransport.java:523)
 at org.elasticsearch.transport.netty.MessageChannelHandler.exceptionCaught(MessageChannelHandler.java:229)
 at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377)
 at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
 at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
 at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:566)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at

Re: issue with elastic search TransportClient of java API

2014-09-26 Thread Ivan Brusic

In general, newer client libraries should not be used with older clusters.
Most of the version checking happens on the server side and the older code
does not know about the newer client.
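
A small guard in application start-up code can make this kind of mismatch visible early. The sketch below is illustrative only: `parse_version` and `client_is_not_newer` are hypothetical helpers, and how the two version strings are obtained (for example from the build configuration and from the cluster's root endpoint) is left as an assumption.

```python
def parse_version(version):
    """Turn a dotted version such as '1.3.2' into (1, 3, 2) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def client_is_not_newer(client_version, cluster_version):
    """Return True when the client library is no newer than the cluster."""
    return parse_version(client_version) <= parse_version(cluster_version)


# The combination reported further down in this thread: client 1.3.2, cluster 1.2.1.
print(client_is_not_newer("1.3.2", "1.2.1"))
```

Comparing tuples of integers avoids the pitfall of string comparison, where "1.10.0" would sort before "1.9.0".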

-- 
Ivan

On Fri, Sep 26, 2014 at 9:54 AM, David Pilato da...@pilato.fr wrote:

 I have no idea. Could be an issue.

 Any chance you could create a small test project which reproduce it and
 share it on github?


 --
 David Pilato | Technical Advocate | elasticsearch.com
 david.pil...@elasticsearch.com
 @dadoonet | @elasticsearchfr | @scrutmydocs



 On 26 September 2014 at 09:40:15, Vijay Tiwary (vijaykr.tiw...@gmail.com)
 wrote:

 Hi David,

 I have identified the problem. The transport client that I was
 creating looked like this:

 Settings settings =
 ImmutableSettings.settingsBuilder().put("client.transport.sniff",
 true).build();
 client = new TransportClient(settings).addTransportAddress(new
 InetSocketTransportAddress("localhost", 9300));

 However, if I turn off sniffing then it works fine.

 I am testing against a single-node cluster, so why does setting the *sniff*
 property to true cause the problem?


 On Friday, September 26, 2014 12:33:46 PM UTC+5:30, Vijay Tiwary wrote:

 I am using Elasticsearch 1.2.1, and the Java client is 1.3.2.

 On Friday, September 26, 2014 12:24:24 PM UTC+5:30, David Pilato wrote:

  Just checking: which version is your Elasticsearch cluster?


 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


 On 26 Sept. 2014 at 08:28, Vijay Tiwary vijaykr...@gmail.com wrote:

  Below is the JUnit test class:

 public class BenchMarkES {
     private static final Logger LOG = LoggerFactory.getLogger(BenchMarkES.class);
     private static TransportClient client = null;

     @Before
     public void setUp() {
         Settings settings = ImmutableSettings.settingsBuilder().build();
         client = new TransportClient(settings).addTransportAddress(
                 new InetSocketTransportAddress("localhost", 9300));
     }

     @Test
     public void doNormalQuery() {
         try {
             int queryExecutionCount = 5;
             SearchResponse sr = null;
             FilterBuilder filter = FilterBuilders.termFilter("brand_context_id", 5);
             long start = System.currentTimeMillis();
             // Run the same search several times and time the whole loop.
             for (int i = 0; i < queryExecutionCount; i++) {
                 sr = launchSearch(filter, 2000);
             }
             long end = System.currentTimeMillis();
             LOG.info("Time taken for the normal query " + (end - start) + " ms");
             SearchHits sh = sr.getHits();
             SearchHit[] searchHit = sh.getHits();
             LOG.info("Hits :" + sh.getTotalHits() + ", Docs fetched :" + searchHit.length);
             /*for (SearchHit doc : searchHit) {
                 LOG.info("Document :" + doc.getSource().get("tweet_id"));
             }*/
         } catch (Exception e) {
             LOG.error(e.getMessage(), e);
         }
     }

     private SearchResponse launchSearch(FilterBuilder filter, int size)
             throws IOException {
         FilteredQueryBuilder fqb = new FilteredQueryBuilder(
                 QueryBuilders.matchAllQuery(), filter);
         SearchRequestBuilder srb = client.prepareSearch("twitter")
                 .setTypes("tweet").setQuery(fqb);
         // Note: "aggregation" is not declared anywhere in the posted snippet.
         if (aggregation != null) {
             srb.addAggregation(aggregation);
         }
         srb.setFrom(0).setSize(size);
         SearchResponse response = srb.execute().actionGet();
         return response;
     }
 }

 So the problem is this: if I execute the block

 for (int i = 0; i < queryExecutionCount; i++) {
     sr = launchSearch(filter, 2000);
 }

 with queryExecutionCount set to 1 it works; however, if I set it to any
 value greater than 1 it fails.


 On Friday, September 26, 2014 11:49:07 AM UTC+5:30, David Pilato wrote:

  What does your Java code look like?
 What was your curl query?

 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


 On 26 Sept. 2014 at 07:34, Vijay Tiwary vijaykr...@gmail.com wrote:


 I have a singleton instance of TransportClient in my
 web application. In one of the flows I have to query Elasticsearch twice,
 one call after the other. However, the first call to the Elasticsearch
 cluster works and the second one fails with the following exception:

  No valid missing index type id: 38
  org.elasticsearch.ElasticsearchIllegalArgumentException: No valid missing index type id: 38
 at org.elasticsearch.action.support.IndicesOptions.readIndicesOptions(IndicesOptions.java:111) ~[elasticsearch-1.3.2.jar:na]
 at org.elasticsearch.action.search.SearchRequest.readFrom(SearchRequest.java:505) ~[elasticsearch-1.3.2.jar:na]
 at org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:209) ~[elasticsearch-1.3.2.jar:na]
 at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:109) ~[elasticsearch-1.3.2.jar:na]
 at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[elasticsearch-1.3.2.jar:na]
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.

Re: Unnecessary Cache Eviction Explained

2014-09-23 Thread Ivan Brusic

Otis, from what I understand, the default size for the cache is unbounded,
so cache eviction should not occur due to inconsistent range checks in the
default case.

--
Ivan

On Mon, Sep 22, 2014 at 9:27 PM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:

Hi,

It sounds like every single ES deployment out there suffers from this, or
am I missing something? Is there an ES issue where this could be tracked
(even if the problem is in Guava)?

Thanks,
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr Elasticsearch Support * http://sematext.com/

On Monday, September 22, 2014 5:27:56 PM UTC-4, Craig Wittenberg wrote:

You can find my proposed changes at https://code.google.com/r/craigwi-guava/.
Comments welcome.

I'm having trouble compiling the tests and so haven't run them yet.

Craig.


Re: Problem with word-separators in bool search with standard tokenizer

2014-09-22 Thread Ivan Brusic

The query string query is working because the ampersand is also being
stripped from the query.

Your best bet is to use the pattern tokenizer and explicitly define which
characters to split the input text on.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html
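
To make the idea concrete, here is a rough sketch in plain Java of what splitting on an explicit character set does. It is not Elasticsearch code and not how the pattern tokenizer is implemented; the class name, the separator set and the lowercasing step are my own assumptions for illustration. The point is only that a standalone "&" never becomes a term if you treat it as a separator yourself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

final class PatternSplitSketch {
    // Hypothetical separator set: whitespace, '&', ';' and ',' all end a token.
    static final Pattern SEPARATORS = Pattern.compile("[\\s&;,]+");

    // Splits on the separators and lowercases, dropping empty pieces.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : SEPARATORS.split(text)) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }
}
```

If the client builds one bool clause per token from the same split, a query such as "Research & Development" would produce clauses only for the two words, so no clause is left searching for a term that was never indexed.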

Cheers,

Ivan

On Mon, Sep 22, 2014 at 9:19 AM, Ankush Jhalani ankush.jhal...@gmail.com
wrote:

just checking back if anyone has any ideas.. thanks!

On Friday, September 19, 2014 11:05:59 AM UTC-4, Ankush Jhalani wrote:

In our search we have configured text with 2 analyzers, english and
standard so we can match phrases on the standard-analyzer. We break the
keywords by space, and create a bool query for each word.

This is working fine for all cases except where the query has standard
word-separators like & (ampersand), ; (semi-colon), etc. As
word-separators are stripped at index time by the analyzer, searching for them
returns 0 results. Gist: https://gist.github.com/ajhalani/3def3ea7caec5cd58490

I don't want to use a whitespace analyzer because we do actually want to
ignore word separators. I was thinking about hacky workarounds like
removing all standalone non-alphanumeric characters, or moving them in
should instead of default must (in case we do have analyzers in future
that are whitespace).

Thanks in advance.


Re: Custom Collector using a plugin

2014-09-19 Thread Ivan Brusic

You essentially want to create your own aggregation; aggregations are
collectors at the Lucene level. Look at existing plugins that provide
custom aggregations.

Elasticsearch uses a scatter-gather/map-reduce model for
distributed collection.

--
Ivan
On Sep 18, 2014 12:56 AM, tim glabisch frk...@googlemail.com wrote:

Hello,

i am just looking for an entry point for a custom (lucene) collector.
is it possible to use a custom collector at all?

what classes do i have to implement to run the collector in a distributed
way?

thanks a lot,
tim


Re: Boosting a type

2014-09-17 Thread Ivan Brusic

I have yet to switch over to groovy, so I can't comment on where your
current script is wrong (it looks good to me as well). However, you can use
the standard function score functions, which are easier to understand and do not
rely on scripting (and so should technically perform better).

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#_using_function_score

Something like:

"function_score": {
  "functions": [
    {
      "boost_factor": 2,
      "filter": {
        "terms": {
          "_type": [
            "my_type2"
          ]
        }
      }
    }
  ]
}
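
As a rough illustration of what that does to scores, here is a tiny plain-Java sketch. It is not Elasticsearch code; it assumes the default multiply behaviour as I understand it, where a document matching the function's filter has its query score multiplied by the boost factor and any other document keeps its score. The class and parameter names are made up for the example.

```java
final class BoostFactorSketch {
    // Multiplies the query score by boostFactor when the document's type
    // matches the filtered type; otherwise the neutral factor 1.0 applies.
    static double boosted(double queryScore, String docType, String boostedType, double boostFactor) {
        double factor = docType.equals(boostedType) ? boostFactor : 1.0;
        return queryScore * factor;
    }
}
```

So with a boost factor of 2, a my_type2 hit scoring 1.5 would end up at 3.0, while a my_type1 hit stays at 1.5.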

-- 
Ivan

On Wed, Sep 17, 2014 at 7:56 AM, Ramy remra...@gmail.com wrote:

 i have solved it in that way...

 GET /my_index/my_type1,my_type2/_search
 {
   "query": {
     "function_score": {
       "query": {
         "match": {
           "value.autocomplete": "lorem ipsum"
         }
       },
       "functions": [
         {
           "script_score": {
             "lang": "groovy",
             "script": "doc['_type'].value == 'my_type2' ? _score * 2 : _score"
           }
         }
       ]
     }
   }
 }

 cheers


Re: Linking of query/search

2014-09-12 Thread Ivan Brusic

You cannot join documents in Lucene/Elasticsearch (at least not as in an
RDBMS). You would need to either denormalize your data, join on the client
side or execute 2+ queries.
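
A client-side join can be as simple as the sketch below. This is plain Java with made-up data shapes (authors keyed by id from one query, posts as {authorId, title} pairs from a second query), not anything from the Elasticsearch API, and it assumes both result sets fit in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ClientSideJoinSketch {
    // authorsById: id -> name, as if returned by a first query.
    // posts: each entry is {authorId, title}, as if returned by a second query.
    static List<String> join(Map<String, String> authorsById, List<String[]> posts) {
        List<String> joined = new ArrayList<>();
        for (String[] post : posts) {
            String author = authorsById.get(post[0]);
            if (author != null) { // inner join: skip posts without a known author
                joined.add(author + ": " + post[1]);
            }
        }
        return joined;
    }
}
```

Denormalizing avoids this step entirely by storing the author name on each post at index time, at the cost of reindexing posts when the name changes.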

--
Ivan

On Fri, Sep 12, 2014 at 12:45 AM, matej.zerov...@gmail.com wrote:

Hello!

Can anyone shine some light on my question?
Is the query in question achievable in ES directly?

If not, I can probably do that in application later, but it would be nicer
if ES could serve me the final results.

Matej


Re: Do I need the JDBC driver

2014-09-12 Thread Ivan Brusic

I would strongly prefer to maintain control of the indexing side and not in
Elasticsearch. In fact, the Elasticsearch team has talked about deprecating
river plugins. I do not have any numbers, but I would suspect that the
majority of users do not use a river plugin. And yes, the correct term is
the JDBC plugin, not driver. The wrong term confused many. :)

--
Ivan

On Fri, Sep 12, 2014 at 3:24 AM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

You can use either style, it is a matter of taste, or convenience.

With the JDBC plugin, you can also push data instead of pull.

Jörg

On Fri, Sep 12, 2014 at 12:11 PM, James m...@employ.com wrote:

I want to close this issue but I still do not understand if I should be
pushing documents from my database using the PHP client or using the JDBC
river to pull them into elasticsearch from the SQL database.

They can both achieve the same thing, but what is the usecase which
defines when is the right time to use each implementation.

On Wednesday, September 10, 2014 10:59:18 AM UTC+1, James wrote:

Hi,

I'm setting up a system where I have a main SQL database which is synced
with elasticsearch. My plan is to use the main PHP library for
elasticsearch.

I was going to have a cron job run every thirty minutes to check for items
in my database that have an active flag but do not yet
have an indexed flag, which means I need to add them to the index. Then I
was going to add each such item to the index. Since I am taking this path,
it doesn't seem like I need the JDBC driver, as I can add items to
elasticsearch using the PHP library.

So, my question is, can I get away without using the JDBC driver?

James


Re: Do I need the JDBC driver

2014-09-12 Thread Ivan Brusic

Elasticsearch is no different than any other data store: your application
can add data by using the prescribed methods. Every data store has some
sort of data input method. Elasticsearch allows river plugins, which means
that the Elasticsearch process can pull data instead of using the standard push
model. The pull model is usually employed when two data sources should be
in sync (CouchDB, RDBMS).

I would stick to the standard push model. Have your client application
index data via the PHP library.
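
The push loop described earlier in the thread (pick rows that are active but not yet indexed, send them, then flag them) might look roughly like the sketch below. It is plain Java rather than PHP, and everything in it is hypothetical: Row stands in for a database row, and Indexer stands in for whatever client call actually sends the document.

```java
import java.util.ArrayList;
import java.util.List;

final class PushSyncSketch {
    // Hypothetical row shape mirroring the flags described in the thread.
    static final class Row {
        final int id;
        final boolean active;
        boolean indexed;

        Row(int id, boolean active, boolean indexed) {
            this.id = id;
            this.active = active;
            this.indexed = indexed;
        }
    }

    // Hypothetical stand-in for the client call that sends one document.
    interface Indexer {
        void index(Row row);
    }

    // One scheduled pass: push active, not-yet-indexed rows and flag them.
    static List<Integer> syncOnce(List<Row> rows, Indexer indexer) {
        List<Integer> pushed = new ArrayList<>();
        for (Row row : rows) {
            if (row.active && !row.indexed) {
                indexer.index(row);
                row.indexed = true;
                pushed.add(row.id);
            }
        }
        return pushed;
    }
}
```

Running the pass twice in a row pushes nothing the second time, which is the property the indexed flag is there to provide.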

Cheers,

Ivan

On Fri, Sep 12, 2014 at 10:54 AM, Employ m...@employ.com wrote:

I must admit I'm new to this so I find some of the information hard to
understand. So sorry if I am asking stupid questions.

On 12 Sep 2014, at 18:26, Ivan Brusic i...@brusic.com wrote:

I would strongly prefer to maintain control of the indexing side and not
in Elasticsearch. In fact, the Elasticsearch team has talked about
deprecating river plugins. I do not have any numbers, but I would suspect
that the majority of users do not use a river plugin. And yes, the correct
term is the JDBC plugin, not driver. The wrong term confused many. :)

--
Ivan

On Fri, Sep 12, 2014 at 3:24 AM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

You can use either style, it is a matter of taste, or convenience.

With the JDBC plugin, you can also push data instead of pull.

Jörg

On Fri, Sep 12, 2014 at 12:11 PM, James m...@employ.com wrote:

They can both achieve the same thing, but what is the usecase which
defines when is the right time to use each implementation.

On Wednesday, September 10, 2014 10:59:18 AM UTC+1, James wrote:

Hi,

I'm setting up a system where I have a main SQL database which is
synced with elasticsearch. My plan is to use the main PHP library for
elasticsearch.

So, my question is, can I get away without using the JDBC driver?

James


Re: Elasticsearch 1.4.0 release data?

2014-09-10 Thread Ivan Brusic

I think this release might be their biggest one since 1.0. Lots of big
changes including a change in the consensus algorithm. It might take time,
but that is only a guess.

--
Ivan

On Wed, Sep 10, 2014 at 2:57 AM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

I use the Github issue tracker to watch the progress of the fabulous ES
dev team

https://github.com/elasticsearch/elasticsearch/labels/v1.4.0

Today: 20 issues left, 4 blockers. Looks like it will still take some days.

Jörg

On Wed, Sep 10, 2014 at 11:39 AM, Dan Tuffery dan.tuff...@gmail.com
wrote:

Is there a release date scheduled for ES 1.4.0? I need the child
aggregation for the project I'm working on at the moment.

https://github.com/elasticsearch/elasticsearch/pull/6936

Dan


Re: Aggregation framework, Java API

2014-09-09 Thread Ivan Brusic

A filtered query with no explicit query will ultimately be translated into
a match-all/constant-score query at the Lucene level. I prefer to
explicitly define all my match all queries and use the specific post filter
name, and not the old filter name, which was deprecated due to its
ambiguity.

Besides, even if you did not have aggregations, you want to do as much
pre-filtering as you can. Post filters work on documents that have already
been scored, and there is no need to score documents that will eventually be
filtered out. Post filters have some benefits, but it seems like they do not
apply in this case.
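
To show the difference in isolation, here is a toy simulation in plain Java. It is not the Elasticsearch API; Doc, Result and search are invented for the example, and the "aggregation" is just a sum. The query-side filter decides what the aggregation sees, while the post filter only trims the hits afterwards.

```java
import java.util.List;
import java.util.function.Predicate;

final class PostFilterSketch {
    // Hypothetical document: a category and a numeric field to aggregate.
    static final class Doc {
        final String category;
        final double price;

        Doc(String category, double price) {
            this.category = category;
            this.price = price;
        }
    }

    // Result of one search: number of hits returned and a sum "aggregation".
    static final class Result {
        final int hits;
        final double sum;

        Result(int hits, double sum) {
            this.hits = hits;
            this.sum = sum;
        }
    }

    // queryFilter narrows what the aggregation sees; postFilter only narrows the hits.
    static Result search(List<Doc> docs, Predicate<Doc> queryFilter, Predicate<Doc> postFilter) {
        double sum = 0;
        int hits = 0;
        for (Doc doc : docs) {
            if (!queryFilter.test(doc)) {
                continue; // excluded before the aggregation runs
            }
            sum += doc.price; // aggregation covers everything the query matched
            if (postFilter.test(doc)) {
                hits++; // post filter applied afterwards, to hits only
            }
        }
        return new Result(hits, sum);
    }
}
```

With two "book" documents (10 and 5) and one "toy" document (7), filtering on "book" as a post filter returns 2 hits but a sum of 22, while the same condition as a query-side filter returns 2 hits and a sum of 15.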

--
Ivan

On Tue, Sep 9, 2014 at 2:26 AM, Emanuel Buzek emanuel.bu...@roke.cz wrote:

Thanks Ivan.

Yes, it was the post filter which was ignored. We use filtered query only
when the user sends a query string, otherwise (when only exact filters for
specific columns are specified) we use the post filter. It seems strange to
me to use the FilteredQuery when the query string is empty, but perhaps
that would be the most straight forward way of doing this.

thank you,
emanuel

Dne pondělí, 8. září 2014 17:21:21 UTC+2 Ivan Brusic napsal(a):

Which filter was ignored? I am assuming you meant the post filter (which
might still be called filter in the Java API), in which case the
filter is bypassed by design. Post filters allow you to filter the
documents returned, but leave the aggregations as is. Sounds like you are
looking for filtered queries. The method name is ambiguous, which is why it
has been renamed (and should actually be deprecated in the API).

Best way to learn the Java API is via the unit tests, but I do agree,
there is no clean way to write elegant code due to explicit casting.

https://github.com/elasticsearch/elasticsearch/tree/master/src/test/java/org/elasticsearch/search/aggregations

Cheers,

Ivan

On Mon, Sep 8, 2014 at 5:41 AM, mooky nick.mi...@gmail.com wrote:

The aggregation takes into account a query - but not a post-filter. I'm
not sure of the rationale behind the difference.

The Java API for traversing results is quite painful - but I think a
good part of that is due to Java and the fact that there is very little
polymorphic behaviour between aggregation results (some have single
results, others have buckets, some have sub-aggregations, some don't).
The only alternative that I can think of is a completely type-less
navigation of the data - which does little more than navigate the JSON
document.

Hope that helps a bit.

On Monday, 8 September 2014 10:26:44 UTC+1, Emanuel Buzek wrote:

Hi there,
I just used the elasticsearch aggregations through the Java API for the
first time.

All I wanted was a simple min/max/sum/avg, so I used the Stats
aggregation. However, I was very surprised that the filter in the
SearchRequestBuilder is ignored, so I had to wrap the Stats Aggregation
into FilterAggregation.

Getting the aggregation result seems a bit tedious:

InternalStats stats = (InternalStats) ((InternalFilter) a).getAggregations().asList().get(0);

Maybe I am using the Java API wrong (I hope I am, otherwise it's imho
poorly designed.) Can anyone point me to an example how to access the
aggregation results from Java better?

Also, I think that the aggregation should be filtered by default. If I
specify the filter with a query or post filter:

queryBuilder = QueryBuilders.filteredQuery(queryBuilder, filterBuilder);

searchRequestBuilder.setQuery(queryBuilder);

and then add an aggregation GET to the same searchRequestBuilder, I
think it's very unintuitive if the aggregation is computed globally. Anyone
has this feeling as well?

thanks, emanuel

--
Emanuel Buzek
Software Engineer, ROKE.cz http://www.roke.cz
tel: +420 776 54 26 26


Re: Faster sloppy phrase queries

2014-09-09 Thread Ivan Brusic

Hopefully Mike McCandless will get some of the new Lucene features into
Elasticsearch:

http://blog.mikemccandless.com/2014/08/a-new-proximity-query-for-lucene-using.html

I suspect it will come soon.

--
Ivan

On Mon, Sep 8, 2014 at 2:11 PM, Nikolas Everett nik9...@gmail.com wrote:

On Mon, Sep 8, 2014 at 5:08 PM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

Is shingling for proximity boosting on multi term phrases an alternative,
like in
http://www.romseysoftware.co.uk/2012/09/27/proximity-boosting-in-elasticsearch/
?

I'm not sure if it'll be good enough though - because it's kind of like 0
slop and we're using 1 slop now. I can certainly try playing with it
though.
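
To illustrate why shingles behave like slop 0, here is a small plain-Java sketch. It is not Lucene's shingle filter; the class and method names are made up, and it only builds two-word shingles from whitespace-separated text. Two words share a shingle only when they are adjacent, so a document with one word in between (which a phrase query with slop 1 would accept) shares none.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class ShingleSketch {
    // Builds two-word shingles ("a b", "b c", ...) from whitespace-separated text.
    static List<String> bigrams(String text) {
        String[] words = text.trim().toLowerCase(Locale.ROOT).split("\\s+");
        List<String> shingles = new ArrayList<>();
        for (int i = 0; i + 1 < words.length; i++) {
            shingles.add(words[i] + " " + words[i + 1]);
        }
        return shingles;
    }

    // True when the document contains at least one shingle of the query.
    static boolean sharesShingle(String query, String document) {
        List<String> docShingles = bigrams(document);
        for (String shingle : bigrams(query)) {
            if (docShingles.contains(shingle)) {
                return true;
            }
        }
        return false;
    }
}
```

For the query "quick fox", the document "the quick fox jumps" shares the shingle "quick fox", but "the quick brown fox" does not.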

Nik


Re: elasticsearch Java API for function_score query

2014-09-09 Thread Ivan Brusic

Malini, I would suggest starting a new thread instead of adding to an old
one.

I find the Java API for the boost functions to be confusing, or at least,
not as clean as the rest of the Java API. I wonder if the Elasticsearch
team would accept a PR. Jörg's example above could be used as a skeleton
for your code. Something like

new FunctionScoreQueryBuilder(existingFilteredQuery)
    .add(termsFilter("abbrev", "computer"), factorFunction(-10f))

-- 
Ivan


On Tue, Sep 9, 2014 at 4:28 PM, Malini malini.ramapra...@gmail.com wrote:

 How do I implement the following query using Java ApI? Thanks!

 curl -XGET 'http://localhost:9200/cs/csdl/_search?pretty=true' -d '
 {
   "query": {
     "function_score": {
       "functions": [
         {
           "boost_factor": -10,
           "filter": {
             "terms": { "abbrev": ["computer"] }
           }
         }
       ],
       "query": {
         "filtered": {
           "query": {
             "multi_match": {
               "fields": ["title"],
               "query": ["computer"]
             }
           },
           "filter": {
             "bool": {
               "must": [
                 {
                   "range": {
                     "pubdate": {
                       "gte": "1890-09",
                       "lte": "2014-08"
                     }
                   }
                 },
                 {
                   "terms": {
                     "abbrev": ["computer", "annals", "software"]
                   }
                 }
               ]
             }
           }
         }
       }
     }
   }
 }'



 On Tuesday, June 10, 2014 1:39:57 PM UTC-7, Jörg Prante wrote:

 Try this

 import org.elasticsearch.action.search.SearchRequest;
 import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;

 import java.util.Arrays;

 import static org.elasticsearch.client.Requests.searchRequest;
 import static org.elasticsearch.index.query.FilterBuilders.termsFilter;
 import static org.elasticsearch.index.query.QueryBuilders.matchQuery;
 import static org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders.factorFunction;
 import static org.elasticsearch.search.builder.SearchSourceBuilder.searchSource;

 public class FunctionScoreTest {

     public void testFunctionScore() {
         // function_score: match on party_id, boost matching course codes by 3
         SearchRequest searchRequest = searchRequest()
                 .source(searchSource().query(new FunctionScoreQueryBuilder(matchQuery("party_id", 12))
                         .add(termsFilter("course_cd", Arrays.asList("writ100", "writ112", "writ113")),
                                 factorFunction(3.0f))));
     }
 }

 Jörg


 On Tue, Jun 10, 2014 at 11:16 AM, Jayanth Inakollu ibsjaya...@gmail.com
 wrote:

 I need to implement the below function_score query using Java APIs. I
 couldn't find any official documentation for function_score query in the
 Java API section of elasticsearch

 "function_score": {
   "functions": [
     {
       "boost_factor": 3,
       "filter": {
         "terms": { "course_cd": ["writ100", "writ112", "writ113"] }
       }
     }
   ],
   "query": {
     "match": {
       "party_id": 12
     }
   }
 }

 Please help!


Re: Aggregation framework, Java API

2014-09-08 Thread Ivan Brusic

Best way to learn the Java API is via the unit tests, but I do agree, there
is no clean way to write elegant code due to explicit casting.

https://github.com/elasticsearch/elasticsearch/tree/master/src/test/java/org/elasticsearch/search/aggregations

Cheers,

Ivan

On Mon, Sep 8, 2014 at 5:41 AM, mooky nick.minute...@gmail.com wrote:

The aggregation takes into account a query - but not a post-filter. I'm
not sure of the rationale behind the difference.

The Java API for traversing results is quite painful - but I think a good
part of that is due to Java and the fact that there is very little
polymorphic behaviour between aggregation results (some have single
results, others have buckets, some have sub-aggregations, some don't).
The only alternative that I can think of is a completely type-less
navigation of the data - which does little more than navigate the JSON
document.

Hope that helps a bit.

On Monday, 8 September 2014 10:26:44 UTC+1, Emanuel Buzek wrote:

Hi there,
I just used the elasticsearch aggregations through the Java API for the
first time.

Getting the aggregation result seems a bit tedious:

InternalStats stats = (InternalStats) ((InternalFilter) a).getAggregations().asList().get(0);

Maybe I am using the Java API wrong (I hope I am, otherwise it's imho
poorly designed). Can anyone point me to an example of how to access the
aggregation results from Java better?

Also, I think that the aggregation should be filtered by default. If I
specify the filter with a query or post filter:

queryBuilder = QueryBuilders.filteredQuery(queryBuilder, filterBuilder);

searchRequestBuilder.setQuery(queryBuilder);

and then add an aggregation GET to the same searchRequestBuilder, I think
it's very unintuitive if the aggregation is computed globally. Does anyone
else have this feeling as well?

thanks, emanuel

--
Emanuel Buzek
Software Engineer, ROKE.cz http://www.roke.cz
tel: +420 776 54 26 26


Re: should ES_HEAP_SIZE be less than 31G?

2014-09-04 Thread Ivan Brusic

On Wed, Sep 3, 2014 at 11:47 AM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:


 ES scales best over multiple machines horizontally, not vertically. More
 RAM does not automatically mean better performance at linear scale at a
 certain point - it depends on the JVM if it can keep up.


Ouch, what a brainfart I had! Yes, I meant to say horizontally as well. If
you shard your indices correctly, or use time-based indices, you can always
increase capacity at a later point. The shard allocator does not take into
consideration the difference in computing power between machines, so it is
best to have all nodes be relatively the same (unless you do manual
allocation). Most clustered software, such as Hadoop, works in such a fashion.
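
For anyone unfamiliar with time-based indices: each period gets its own index, so a later index can be created with different settings or land on newly added nodes. A minimal sketch of the naming side in Python, assuming one index per day and a hypothetical "logs" prefix (both helper functions are made up for illustration):

```python
from datetime import date, timedelta


def daily_index(prefix, day):
    """Name of the index that holds documents for one calendar day."""
    return f"{prefix}-{day:%Y.%m.%d}"


def indices_for_range(prefix, start, end):
    """All daily index names needed to search from start to end, inclusive."""
    days = (end - start).days
    return [daily_index(prefix, start + timedelta(days=n)) for n in range(days + 1)]


# "logs" is a hypothetical prefix.
print(daily_index("logs", date(2014, 9, 4)))
# logs-2014.09.04
print(indices_for_range("logs", date(2014, 9, 3), date(2014, 9, 4)))
# ['logs-2014.09.03', 'logs-2014.09.04']
```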

-- 
Ivan


Re: Is it possible to add yet another score value based on similarity (same words) to differentiate between two _scores ?

2014-09-04 Thread Ivan Brusic

Can you simply boost the non analyzed field? If the scores are still too
similar, try using a dis_max query with the non analyzed query getting a
higher boost:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-dis-max-query.html
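
A minimal sketch of such a request body, built in Python. The field names "name" and "name.raw" and the boost of 5.0 are assumptions for illustration; substitute the analyzed field and its not_analyzed sibling from your own mapping.

```python
import json


def exact_first_query(text, analyzed_field="name", raw_field="name.raw", raw_boost=5.0):
    """dis_max over an ngram-analyzed field and its not_analyzed sibling."""
    return {
        "query": {
            "dis_max": {
                "queries": [
                    {"match": {analyzed_field: {"query": text}}},
                    {"term": {raw_field: {"value": text, "boost": raw_boost}}},
                ]
            }
        }
    }


body = exact_first_query("Motoroil")
print(json.dumps(body, indent=2))
```

Note that the term clause only matches when the whole stored value equals the search string exactly, so it separates "Motoroil" from "Motorblock" but would not by itself reward the same words in a different order.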

--
Ivan

On Wed, Sep 3, 2014 at 7:16 AM, Pontus Lundin lundin.codei...@gmail.com
wrote:

Hi!

I have created a multi_field index with one field analyzed
(with edgengram, min:3, max:15) and the other one not. Then I do a
multi match on this and get relevant hits. I am doing this to find exact
matches, which seems to work.

So far so good. However, how do I separate hits that are really relevant
to my search string (i.e. the words are equal but might be in another
order, etc.) from false-positive results from the ngram, which can have a
very different meaning?

An example would be:

*query*:Crankshaft position sensor
*hits*:Position Sensor, Crankshaft

This is a very good and similar result, and the score is equal to the max
score.

However, I cannot draw any conclusion from comparing only the
score value, because another example could yield the same score but should
not rank as high, because the meaning is different.

*query*:Motoroil
*hit*:Motorblock

This is not relevant, but of course it originates from the ngram. The hit
score is equal to the max score.
Of course I could increase the min and max on the ngram, but it
seems useful for other cases, so that is not really an option.

Thanks!


Re: How do I remove _index, _type, _id and _score from output?

2014-09-04 Thread Ivan Brusic

There is a plugin which can help:
https://github.com/jprante/elasticsearch-index-termlist

--
Ivan

On Wed, Sep 3, 2014 at 11:47 AM, David Pilato da...@pilato.fr wrote:

I don't think you can, as far as I remember from the same thread about it some
months ago.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 3 Sept. 2014, at 20:40, Gerald DeConto geralddeco...@gmail.com
wrote:

we process millions of search queries a day and are looking at adding
elasticsearch to our toolbox

However, I haven't found a way to turn off the following redundant fields
from the output:

_index
_type
_id
_score

is there a way to only return them once in the results or to turn them off
completely?

it doesn't make sense to have these fields returned for every document we
get back unless we really want them returned. It bloats the response and
adds network load

any help appreciated


Re: Learning optimal boost weight [ML]

2014-09-04 Thread Ivan Brusic

I have something similar which uses search analytics to determine relevant
filters. No plugin or framework since everything works on the client side
during the creation of the query. The process is far from ideal and is
currently very conservative, providing only a slight boost. It does not
work on multiple fields, only on the different values of a single field.

For boosting, I use filtered subsets. The process works well and the boost
is applied after Lucene scoring:

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/function-score-filters.html

"functions": [
  {
    "filter": {
      "term": {
        "tag": 34252366377419
      }
    },
    "boost_factor": 1.1
  }
]
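
For context, here is a sketch of how such a functions list could be assembled on the client side and wrapped into a full function_score request, written in Python. The helper name, the base query, and its "title" field are hypothetical; only the tag value and the 1.1 factor come from the snippet above.

```python
import json


def boosted_query(base_query, tag_boosts):
    """Wrap base_query in function_score with one filter function per tag."""
    functions = [
        {"filter": {"term": {"tag": tag}}, "boost_factor": factor}
        for tag, factor in sorted(tag_boosts.items())
    ]
    return {"query": {"function_score": {"query": base_query, "functions": functions}}}


# The tag value and boost come from the snippet above; the base query
# and its "title" field are hypothetical.
body = boosted_query({"match": {"title": "sensor"}}, {34252366377419: 1.1})
print(json.dumps(body, indent=2))
```

The tag-to-boost mapping is where learned or analytics-derived weights would plug in; how those weights are obtained is outside this sketch.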

Cheers,

Ivan

On Thu, Sep 4, 2014 at 8:16 AM, NM n.maisonne...@gmail.com wrote:

Hi,

I have several fields in play in the query.

Some fields are more important than others, which requires setting boost factors.

I would like to automatically learn the optimal boost weight for each
field based on a training data set.

Is there a plugin or framework to do that nicely with ES?


Re: should ES_HEAP_SIZE be less than 31G?

2014-09-03 Thread Ivan Brusic

The actual limitation in Java is compressed pointers:

http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#compressedOop
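
The arithmetic behind the roughly 32 GB ceiling, as a small Python sketch. It assumes HotSpot's default 8-byte object alignment; fits_compressed_oops is a made-up helper, and the real cutoff a given JVM applies can sit somewhat below the theoretical figure, which is presumably where the 31G advice comes from.

```python
REFERENCE_BITS = 32        # width of a compressed object pointer
OBJECT_ALIGNMENT = 8       # bytes; HotSpot's default object alignment

# A compressed pointer is an object offset, not a byte address, so it can
# reach 2**32 objects that each start on an 8-byte boundary.
max_heap_bytes = (2 ** REFERENCE_BITS) * OBJECT_ALIGNMENT
print(max_heap_bytes // 2 ** 30)  # 32 (GiB)


def fits_compressed_oops(heap_gib):
    """True if a heap of heap_gib GiB is below the theoretical 32 GiB ceiling."""
    return heap_gib * 2 ** 30 < max_heap_bytes


print(fits_compressed_oops(31))  # True
print(fits_compressed_oops(64))  # False
```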

Thankfully Elasticsearch can run multiple nodes on the same server. Just
pay attention to the direct size (off heap memory), mlockall settings and
Lucene merge settings, as well as the allocation property referenced by
David. Elasticsearch scales quite nicely vertically, so I would explore
that option if possible.

Cheers,

Ivan

On Wed, Sep 3, 2014 at 9:40 AM, vineeth mohan vm.vineethmo...@gmail.com
wrote:

Hi ,

I am not sure if there is anything related to ES that makes high RAM
problematic, but in general very high memory is not advised in Java
systems. One of the reasons is the cobra effect, which stems from the
fact that garbage collection might need more time with more memory.

COBRA EFFECT -
https://plumbr.eu/blog/increasing-heap-size-beware-of-the-cobra-effect

Thanks
Vineeth

On Wed, Sep 3, 2014 at 10:04 PM, Jinyuan Zhou zhou.jiny...@gmail.com
wrote:

I read somewhere that ES_HEAP_SIZE is best kept below 31G, because in
that case the JVM can use a 32-bit number to address memory locations. If
my server has about 64G of RAM, this seems perfect. But what if I have a
server with 128G of RAM and sufficient CPUs? Which configuration is better:
a) two nodes on the server, each with an ES_HEAP_SIZE of about 31G, or
b) one node on the server with an ES_HEAP_SIZE of, say, 64G?
Thanks,


Re: Looking for Elasticsearch projects

2014-09-03 Thread Ivan Brusic

Thanks Jörg.

The incentive for an open-source project is to pad my resume, since I have
been working with obsolete technologies and processes for almost the past
three years. I implemented many changes at my company (Elasticsearch,
Maven, central logging, application-level monitoring), but there is only so
much one person can do. Plus, I love this stuff. The incentive for simply
contracting is purely money! I do not really need the cash, but I plan to
embark on some travels and it would ease my mind a bit.

Your project list reminds me of a project I have been working on, but I
could use some help. I am looking for datasets that also include example
queries and golden records for those queries. My goal is to test different
similarity algorithms using unknown data. Would love to use the Wikipedia
dump, but I never found any golden records. Perhaps Nik has something. The
only thing I have found are the TREC datasets, but I was hoping for a more
sizable example.

Ivan

On Wed, Sep 3, 2014 at 12:36 PM, joergpra...@gmail.com
joergpra...@gmail.com wrote:

If you want to lend a hand for interesting projects, here are some of my
current favorites:

- building a global library catalog index with Elasticsearch of all the
open data / metadata on academic library servers, complete with harvester
and updater, SRU, OAI etc. A starting point for SRU implementation is
https://github.com/xbib/elasticsearch-sru

- implementing a plugin for Elasticsearch that turns ES into a W3C Linked
Data Platform
https://dvcs.w3.org/hg/ldpwg/raw-file/default/ldp-primer/ldp-primer.html,
with HTTP PATCH support, JSON Patch RFC 6902, maybe even a Sparql-to-ES DSL
translator

- a harvester/pull plugin framework for ES, in order to supersede the
river singleton concept, with provisioning for all kinds of different
sources, e.g. JDBC, or web crawling

- helping British Library labs to find correct image legend texts in OCR
XML from the book scanning project. See
http://www.bbc.com/news/technology-28976849 I think Elasticsearch can
handle the 230G zipped input. I got a copy from BL. No good algorithm
exists yet. Maybe with ES? First step would be to design an index and to
index/publish the OCR for better search?

Not sure where the incentives are. Ever lasting fame, honor, glory, world
domination, super power etc.

Jörg

On Wed, Sep 3, 2014 at 1:47 AM, Nikolas Everett nik9...@gmail.com wrote:

We could always use help with CirrusSearch. It is the open source project
that links MediaWiki to Elasticsearch. We have it installed on all the
wikis at the Wikimedia Foundation, but it isn't the default search backend
on the largest ones yet.

Selling points:
Huge user community
Basic queries work reasonably well
Expert syntax to support power users
PHP
Elastica
I manage the elasticsearch installation
I contribute changes we need upstream
Uses customized highlighter (also needs contributors)
Reasonably easy development installation with vagrant
Working on it is my full time job so review would be quick

Nik
On Sep 2, 2014 6:51 PM, Ivan Brusic i...@brusic.com wrote:

For those that are not regulars on the mailing list, I am a fairly
active member that has used Elasticsearch for years.

I am leaving my full-time job to focus on other (techie and non-techie)
goals and would love to work on some interesting projects part-time. It can
be either paid assignments or free open-source projects. My main interests
are search with a focus on development. Not too keen on devops tasks such
as administering servers. I would rather work on my own stuff than be a
sysadmin. :)

Feel free to contact me directly via email.

Cheers,

Ivan


Re: Exists filter does not respect must_not bool filter

2014-09-03 Thread Ivan Brusic

Is giving.assignee a sub-object or a nested document? Can you provide your
mapping? Use the mapping API for exact results (
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-get-mapping.html
)

Perhaps enabling explain would provide some hints.
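
The reason the mapping matters: if giving turns out to be mapped as nested, the exists check would need to run inside a nested filter. Here is a sketch in Python that builds both variants of the request body. The helper name is made up, and treating "giving" as the nested path is an assumption; whether this explains the behavior reported in the quoted message below is not confirmed.

```python
import json


def missing_assignee_query(nested_path=None, field="giving.assignee", size=2000):
    """Build a 'field does not exist' search body for an object or nested mapping."""
    exists = {"exists": {"field": field}}
    if nested_path is not None:
        # Nested documents are indexed separately, so the exists check has to
        # run inside a nested filter on the enclosing path.
        exists = {"nested": {"path": nested_path, "filter": exists}}
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"bool": {"must_not": [exists]}},
            }
        },
        "size": size,
    }


print(json.dumps(missing_assignee_query(nested_path="giving"), indent=2))
```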

-- 
Ivan


On Wed, Sep 3, 2014 at 2:07 PM, ElasticRabbit ayushsang...@gmail.com
wrote:

 Hi Everyone,

 Goal: I want to find all the documents which do not have the giving.assignee
 field.

 I am executing the query below on ES version 1.3.2, involving an exists filter
 and a bool filter.

 {
   "query": {
     "filtered": {
       "query": {
         "match_all": {}
       },
       "filter": {
         "bool": {
           "must_not": [
             {
               "exists": {
                 "field": "giving.assignee"
               }
             }
           ]
         }
       }
     }
   },
   "size": 2000
 }

 This query also returns documents where the giving.assignee field exists
 or has some value in it.
 We have around 2 million documents and it returns close to 2 million
 of them.

 I have also tried using the missing filter but no luck.

 {
   "query": {
     "filtered": {
       "query": {
         "match_all": {}
       },
       "filter": {
         "bool": {
           "must": [
             {
               "missing": {
                 "field": "giving.assignee"
               }
             }
           ]
         }
       }
     }
   },
   "size": 2
 }

 Same result as the query above.
 If someone can point out what I am doing wrong here, or if further
 information is needed, please let me know.
 Looking forward to your help.

 Thanks,
 Ayush Sangani



Re: Looking for Elasticsearch projects

2014-09-03 Thread Ivan Brusic

I did not realize Jörg's response was to the list and not privately (as
most other responses were). I am thankful that I did not bad mouth my
employer too badly! :)

I am very aware of the open relevancy project and its discontinued status.
I emailed the Lucene mailing list about it not too long ago. Would love to
work on something in that regard.

Cheers,

Ivan

On Wed, Sep 3, 2014 at 2:16 PM, Itamar Syn-Hershko ita...@code972.com
wrote:

On Thu, Sep 4, 2014 at 12:10 AM, Ivan Brusic i...@brusic.com wrote:

Thanks Jörg.

The incentive for an open-source project is to pad my resume, since I
have been working with obsolete technologies and processes for almost the
past three years. I implemented many changes at my company (Elasticsearch,
Maven, central logging, application-level monitoring), but there is only so
much one person can do. Plus, I love this stuff. The incentive for simply
contracting is purely money! I do not really need the cash, but I plan to
embark on some travels and it would ease my mind a bit.

Ivan, I'm actually working on something like this (and I don't think Jorg
actually meant that..). I was involved with
https://lucene.apache.org/openrelevance/ but it's now discontinued, and in
some spare time I have I'm trying to take that initiative forward.

Ping me privately if that sounds interesting and we can continue
discussing.

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer Consultant
Author of RavenDB in Action http://manning.com/synhershko/


Re: Dynamic mapping stops at a field called _id

2014-09-03 Thread Ivan Brusic

The _id field is one of the few reserved field names in Elasticsearch:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-id-field.html

You can set it to whatever you want, as long as it is not an object or
(empty) array, as in your case. I have no idea what the proper behavior
should be when you try to index bad data to this field, but I am not
surprised that there are issues.
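
Until the server-side behavior is clarified, one option is to guard against this on the client before indexing. A minimal sketch in Python; check_reserved_id is a hypothetical helper, not part of any Elasticsearch client.

```python
def check_reserved_id(source):
    """Raise ValueError if the document body carries an unusable _id value."""
    if "_id" in source and isinstance(source["_id"], (dict, list)):
        kind = type(source["_id"]).__name__
        raise ValueError(f"_id must be a scalar value, got {kind}")
    return source


check_reserved_id({"a": "b", "c": "d"})            # passes through unchanged
try:
    check_reserved_id({"a": "b", "_id": {}, "c": "d"})
except ValueError as err:
    print(err)  # _id must be a scalar value, got dict
```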

-- 
Ivan


On Wed, Sep 3, 2014 at 3:12 AM, André Hänsel an...@webkr.de wrote:

 When indexing a document, when there is a field named _id in the source,
 the fields that come after this one don't get added to the mapping, even
 though they are stored in the _source:

 PUT /megacorp/blah/3
 {
   "a": "b",
   "_id": {},
   "c": "d"
 }

 The document got stored:

 GET megacorp/blah/_search
 {
   "took": 8,
   "timed_out": false,
   "_shards": {
     "total": 5,
     "successful": 5,
     "failed": 0
   },
   "hits": {
     "total": 1,
     "max_score": 1,
     "hits": [
       {
         "_index": "megacorp",
         "_type": "blah",
         "_id": "3",
         "_score": 1,
         "_source": {
           "a": "b",
           "_id": {},
           "c": "d"
         }
       }
     ]
   }
 }

 But no mapping was added for c:

 GET /megacorp/_mapping/blah
 {
   "megacorp": {
     "mappings": {
       "blah": {
         "properties": {
           "a": {
             "type": "string"
           }
         }
       }
     }
   }
 }

 Is this expected behavior?

 Regards,
 André

