Cannot specify a query in the target index and through es.query when working with ES, Wikipedia River and Hive

2015-04-20 Thread Gordon
(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Job Submission failed with exception 'org.elasticsearch.hadoop.EsHadoopIllegalArgumentException(Cannot specify a query in the target index and through es.query)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask


Regards,

Gordon

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ffaa8665-8123-4902-b33a-fb0e28488938%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: behavior of scrolling search with from parameter in newer ES versions

2015-01-21 Thread Gordon Tillman
Thanks Martijn,

That is what I have observed.  But it is a regression from ES version 1.0.1 
and before. And I can't find anything that even implies that `from` is not 
supported for a scroll search, except of course when `search_type=scan`.

I would love to get support for that back into the product.  

--g



On Wednesday, January 21, 2015 at 2:57:29 AM UTC-6, Martijn v Groningen 
wrote:

 Hi Gordon,

 This `from` is kind of ignored for scroll search. I don't remember why 
 that was the case, but it seems to me that scroll search can/should take 
 into account the `from` option during the first scroll search request.

 Martijn





 -- 
 Kind regards,

 Martijn van Groningen
  



Re: behavior of scrolling search with from parameter in newer ES versions

2015-01-21 Thread Gordon Tillman
Thank you Martijn!

On Wednesday, January 21, 2015 at 8:04:23 AM UTC-6, Martijn v Groningen 
wrote:

 I agree, this should be fixed: 
 https://github.com/elasticsearch/elasticsearch/issues/9373


behavior of scrolling search with from parameter in newer ES versions

2015-01-20 Thread Gordon Tillman
Greetings All,

I ran into an interesting issue when upgrading from ES version 1.0.1 to 
newer versions.  In particular, I tested the following with versions 1.2.4, 
1.3.4 and 1.4.2.

*summary*

When doing a normal scroll search (not one with a *search_type=scan*), it 
appears that the from parameter is being ignored.


*example 1 (no scrolling, result is correct)*

In this simplified example, only one document matches the supplied query, 
so with *from=1* no documents are returned.

curl localhost:9200/hm-community-alias/FileInfo/_search?q=parents:c10ed0583104036a94e110f0a8b5fd7d4\&from=1

*example 2 (with scrolling, incorrect result)*

In this example, where we specify the same query and from parameters, but 
also specify a scroll parameter, we incorrectly get the single document 
returned from the query.

curl localhost:9200/hm-community-alias/FileInfo/_search?q=parents:c10ed0583104036a94e110f0a8b5fd7d4\&from=1\&scroll=2s | json

*notes*


   1. Both of the above test cases work correctly in version 1.0.1
   2. In newer versions (where example 2 fails), I noticed that the *from* 
   value is not present in the data that was returned from the query.
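
While `from` is ignored for scroll requests, one possible stopgap is to discard the first `from` hits on the client side. The following is only a minimal sketch in JavaScript: the function name `skipFirstHits` and the representation of each scroll page as a plain array of hits are assumptions made for illustration, not part of any Elasticsearch client.

```javascript
// Hypothetical client-side workaround: drop the first `from` hits from a
// sequence of scroll pages. Each page is an array of hits in sorted order.
function skipFirstHits(pages, from) {
  let remaining = from; // number of hits still to discard
  const result = [];
  for (const page of pages) {
    if (remaining >= page.length) {
      remaining -= page.length; // the whole page falls before the offset
      continue;
    }
    result.push(page.slice(remaining)); // keep the tail of this page
    remaining = 0;
  }
  return result;
}

// One matching document and from=1: nothing is left, as in example 1.
console.log(skipFirstHits([["doc-1"]], 1));
```

This only makes sense for a sorted scroll; with *search_type=scan* the order is undefined, so skipping a prefix has no stable meaning.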


I understand that with *search_type=scan* this would be expected because 
sorting is disabled.  Also, please note this is a greatly simplified query 
just for illustration purposes.  I've attached a sample of an actual query 
at the bottom of this post, and *it does contain sorting specifications*.

Is this a known issue, or is this being done by design in the newer ES 
versions?  A quick scan through the release notes turned up nothing.

Many thanks for any insight!

--g

*sample of full query*


{
    "sort": [
        {"_type": {"order": "asc", "ignore_unmapped": true}},
        {"name_lower": {"order": "asc", "ignore_unmapped": true}},
        {"dds_key": {"order": "asc", "ignore_unmapped": true}}
    ],
    "query": {
        "filtered": {
            "filter": {
                "and": [
                    {"or": [{"term": {"parent": "c10ed0583104012f94e11ad0ac36f2aaf"}}]},
                    {"not": {"term": {"vcn": "DeleteMarker"}}},
                    {"not": {"exists": {"field": "notfinalized"}}}
                ]
            },
            "query": {"match_all": {}}
        }
    },
    "from": 1,
    "size": 1000
}



Is there a better way to achieve my goal than having multiple completion suggesters on a single index?

2014-07-23 Thread Gordon Rankin
I have an index of photos and need to return completion suggestions based 
on several of the fields:


   - Tags
   - Place
   - Country
   - Date

The simplest way to do this of course would be to create one completion 
suggester and simply feed the various inputs into it when indexing. 

However, I need to receive up to 5 suggestions per field and I need to 
return various different outputs depending on the input (they cannot simply 
have a unified output)

For example:

When the user types "T" the suggestions should be something like the 
following:

Tags : [Tree, Tiger, Toner]
Place : [Tenerife, The London Eye, Torquay]
Country : [Taiwan, Tanzania]
Date : []

The date field simply stores tags for the month and year [January, 2014], 
so that January is suggested when a user types "jan" and year suggestions 
are returned when the user types "20", etc.

In order to achieve this I have set up a separate completion suggester, 
each with its own analyzer, for each of the above fields. I then query all 
four suggesters at once in a single request.  Everything works perfectly.
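
To illustrate what querying all four suggesters in one request might look like, here is a minimal sketch that builds a single suggest body with one named entry per field. The helper `buildSuggestBody` and the field names (in particular `date_suggest`) are assumptions for illustration; only the `text` / `completion` / `field` / `size` layout follows the usual completion suggester request shape.

```javascript
// Build one suggest request body containing a named completion
// suggester per field, all sharing the same input text.
function buildSuggestBody(term, fields, size) {
  const body = {};
  for (const field of fields) {
    body[field] = {
      text: term,
      completion: { field: field, size: size }
    };
  }
  return body;
}

// Hypothetical field names, one per suggester, up to 5 suggestions each.
const suggestBody = buildSuggestBody(
  "T",
  ["tags_suggest", "place_suggest", "country_suggest", "date_suggest"],
  5
);
console.log(JSON.stringify(suggestBody, null, 2));
```

Because each suggester is named, the response can be split back into per-field groups (Tags, Place, Country, Date) without any extra bookkeeping.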

However I am left wondering if there is a better way to achieve this 
functionality.


   - Is there any way to achieve the above with a single completion 
   suggester?
   - Are there any concerns or pitfalls when querying multiple suggesters in 
   this manner, performance or otherwise?

Thanks in advance for any advice or suggestions.



Re: Is there a better way to achieve my goal than having multiple completion suggesters on a single index?

2014-07-23 Thread Gordon Rankin
Thanks Adrien...

Thanks for your speedy response.  I'm very new to Elasticsearch so it's 
good to know I am doing the right thing.

I guess I'll continue as I am unless anyone else can think of any reason 
not to.

Cheers!
 

On Wednesday, July 23, 2014 2:34:08 PM UTC+1, Adrien Grand wrote:

 Hi Gordon,

 Given your requirements, I think you are doing the right thing. There is 
 no particular concern wrt querying multiple suggesters at the same time.






 -- 
 Adrien Grand
  



Using _msearch with suggesters only?

2014-07-23 Thread Gordon Rankin
I have to query several completion suggesters at the same time.  This is 
easy to do using the _suggest api.

However if I want to query multiple suggesters on different indexes I have 
two choices:


   1. Perform multiple http requests using the _suggest api
   2. Use the _msearch api.


I am currently using option 1, which appears to perform reasonably well so 
far. However, I would like to perform only a single request if possible, so 
I have been experimenting with the _msearch api.

The problem is that when performing a single _msearch request for multiple 
suggesters on multiple indexes, each index also ends up getting queried, 
returning all the hits as though I had also queried with match_all: {}.

In order to minimize this issue I have set the size parameter to 0, which 
prevents the hits from being returned.  However, I still get a total count 
returned for all the documents in each index.  

This implies to me that Elasticsearch is still doing some unnecessary work 
matching and counting documents in each index when all I really want 
returned are the suggestions.


   - Is Elasticsearch actually performing any work other than for the 
   suggesters?
   - If so, is there another way to return suggestions from multiple 
   completion suggesters across several indices in a single request without 
   querying at all?
   - Would using _msearch in this manner be considered a better practice 
   than performing two or more _suggest calls in parallel?





My request currently looks like this:



var request = [{
    index: 'users',
    type: 'user'
},
{
    size: 0,
    suggest: {
        users_suggest: {
            text: term,
            completion: {
                size: 5,
                field: 'users_suggest'
            }
        }
    }
},
{
    index: 'photos',
    type: 'photo'
},
{
    size: 0,
    suggest: {
        tags_suggest: {
            text: term,
            completion: {
                size: 3,
                field: 'tags_suggest'
            }
        },
        place_suggest: {
            text: term,
            completion: {
                size: 3,
                field: 'place_suggest'
            }
        },
        country_suggest: {
            text: term,
            completion: {
                size: 3,
                field: 'country_suggest'
            }
        }
    }
}];
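
For reference, the array above alternates a header object and a body object, and the _msearch endpoint itself takes these as newline-delimited JSON, one object per line with a trailing newline. I am assuming here that the array is handed to a client library that does this serialization; the sketch below shows the same step by hand, with `toMsearchPayload` being a made-up helper name.

```javascript
// Serialize alternating header/body objects into the newline-delimited
// JSON payload used by _msearch: one JSON object per line, with a
// newline after the last line as well.
function toMsearchPayload(items) {
  return items.map((item) => JSON.stringify(item)).join("\n") + "\n";
}

const exampleTerm = "t"; // example user input
const payload = toMsearchPayload([
  { index: "users", type: "user" },
  {
    size: 0,
    suggest: {
      users_suggest: {
        text: exampleTerm,
        completion: { size: 5, field: "users_suggest" }
      }
    }
  }
]);
console.log(payload);
```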




And the results I am getting returned are as follows :

[{
    took: 8,
    timed_out: false,
    _shards: {
        total: 5,
        successful: 5,
        failed: 0
    },
    hits: {
        total: 28,
        max_score: 0,
        hits: []
    },
    suggest: {
        users_suggest: [{
            text: "t",
            offset: 0,
            length: 1,
            options: [***suggestions***]
        }]
    }
}, {
    took: 8,
    timed_out: false,
    _shards: {
        total: 5,
        successful: 5,
        failed: 0
    },
    hits: {
        total: 117,
        max_score: 0,
        hits: []
    },
    suggest: {
        country_suggest: [{
            text: "t",
            offset: 0,
            length: 1,
            options: []
        }],
        place_suggest: [***suggestions***],
        tags_suggest: [***suggestions***]
    }
}]
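
Since only the suggestions are of interest, the `hits` section of each response can simply be ignored on the client. Here is a rough sketch of that step, assuming the array of per-search responses shown above; `collectSuggestions` is a hypothetical helper, and the option objects in the sample data are made up.

```javascript
// Collect only the `suggest` sections from an array of _msearch
// responses, ignoring the `hits` section each response also carries.
function collectSuggestions(responses) {
  const merged = {};
  for (const response of responses) {
    const suggest = response.suggest || {};
    for (const [name, entries] of Object.entries(suggest)) {
      // Each entry corresponds to one input text; flatten its options.
      merged[name] = entries.flatMap((entry) => entry.options);
    }
  }
  return merged;
}

// Made-up sample shaped like the responses above.
const sampleResponses = [
  {
    hits: { total: 28, max_score: 0, hits: [] },
    suggest: {
      users_suggest: [
        { text: "t", offset: 0, length: 1, options: [{ text: "tom", score: 1 }] }
      ]
    }
  },
  {
    hits: { total: 117, max_score: 0, hits: [] },
    suggest: {
      country_suggest: [{ text: "t", offset: 0, length: 1, options: [] }]
    }
  }
];
console.log(collectSuggestions(sampleResponses));
```

This does not answer whether the server does extra work for the implicit query; it only keeps that extra data out of the calling code.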


