Wrong paging with embedded es instance

2014-06-06 Thread Дмитрий Киселев
Hi everyone.

I use an embedded ES node as part of a Java application.

Node node = nodeBuilder().clusterName("OSM-Gazetteer").node();
Client client = node.client();

I am trying to fetch some data page by page.

SearchRequestBuilder searchQ = client.prepareSearch("gazetteer")
    .setSearchType(SearchType.QUERY_AND_FETCH)
    .setNoFields()
    .setQuery(QueryBuilders.matchAllQuery())
    .setExplain(false);

searchQ.setSize(PAGE_SIZE);        // PAGE_SIZE = 5
searchQ.setFrom(page * PAGE_SIZE); // page = 0

Here is the query generated by the client:

{   from : 0,   size : 5,   query : { match_all : { }   },
explain : false,   fields : [ ] }

The curl version returns 5 hits, as expected, but Java returns 20 hits:

searchQ.get().getHits().getHits().length; // = 20

Index settings are default. The hits do not contain duplicates.

Is there a workaround?

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/647bddb1-636f-4670-9f9a-b0767001fdd8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Wrong paging with embedded es instance

2014-06-06 Thread David Pilato
Could you print your searchQ object?

Maybe using toString().

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs



Re: Wrong paging with embedded es instance

2014-06-06 Thread Дмитрий Киселев
Yep.

System.out.println(searchQ.toString());

{
  from : 0,
  size : 5,
  query : {
    match_all : { }
  },
  explain : false,
  fields : [ ]
}

Also, I think it might be connected with sharding.
When I changed number_of_shards to 1, paging started to behave as I
expect.

With number_of_shards set to 5, I get 25 hits.
With number_of_shards set to 4, I get 20 hits.

It seems that from and size are applied to every shard separately in my case.
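
The counts above (5, 20 and 25 hits for 1, 4 and 5 shards) fit a simple
model in which every shard answers the from/size window on its own and the
per-shard hits are concatenated without trimming. To my knowledge this is
how the query_and_fetch search type is documented to behave (size hits per
shard), while the REST default, query_then_fetch, returns size hits in
total. The sketch below is plain Java arithmetic, not Elasticsearch code;
the class name, method names and the even split of the 3879 documents
across shards are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical model, not Elasticsearch code: each shard applies the same
// from/size window to its own documents and the results are concatenated.
class PerShardPagingSketch {

    // Hits a single shard contributes for the window [from, from + size).
    static int hitsFromShard(int docsInShard, int from, int size) {
        int available = Math.max(0, docsInShard - from);
        return Math.min(size, available);
    }

    // Total hits when nothing trims the concatenated per-shard results.
    static int totalHits(int[] docsPerShard, int from, int size) {
        return Arrays.stream(docsPerShard)
                     .map(docs -> hitsFromShard(docs, from, size))
                     .sum();
    }

    public static void main(String[] args) {
        // Assumed even distribution of the 3879 documents mentioned in the thread.
        int[] oneShard = {3879};
        int[] fourShards = {970, 970, 970, 969};
        int[] fiveShards = {776, 776, 776, 776, 775};

        System.out.println("1 shard:  " + totalHits(oneShard, 0, 5));
        System.out.println("4 shards: " + totalHits(fourShards, 0, 5));
        System.out.println("5 shards: " + totalHits(fiveShards, 0, 5));
    }
}
```

If this model is what happens here, switching the request to
SearchType.QUERY_THEN_FETCH (or leaving the search type unset) would be the
first thing to try.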

-- 
Thank you for your time. Best regards.
Dmitry.



Re: Wrong paging with embedded es instance

2014-06-06 Thread David Pilato
Can you also print the full response object (toString())?

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr




Re: Wrong paging with embedded es instance

2014-06-06 Thread David Pilato
The total hit number is really inconsistent:

With one shard you get: 656523
With 5 shards you get: 3879

I think you are doing something wrong but I can't tell more without looking at 
the full source code.
Could you share how you actually execute the query?

Are you sure your Java client is connected to the right instance/cluster?


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 6 June 2014 at 19:49:47, Дмитрий Киселев (dmitry.v.kise...@gmail.com) wrote:

This is with 5 shards.
{
  took : 81,
  timed_out : false,
  _shards : {
    total : 5,
    successful : 5,
    failed : 0
  },
  hits : {
    total : 3879,
    max_score : 1.0,
    hits : [ {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0016087997-w162848733-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0088827105-n2270743905-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0097856729-n2270743903-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0145983393-w154644839-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0209772668-n1884206099-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0015203925-w147150792-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0017569140-n2495059507-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0071389729-w147150672-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0130455978-w145925771-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0139624280-w147150701-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-007978-w194531715-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0096243499-w194531714-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0162691059-w164700540-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0202220208-w164698447-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0287053820-n2270743890-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0013765594-w145949343-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0026389358-w147150656-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0093401200-w162848869-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0203517601-n2270743895-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0217217459-n2270743898-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0004074128-w145925740-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0007815983-w175372179-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0039697750-w164700428-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0053473990-w271448695-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0129549665-w162848862-regular,
      _score : 1.0
    } ]
  }
}

And this is with one shard
{
  took : 123,
  timed_out : false,
  _shards : {
    total : 1,
    successful : 1,
    failed : 0
  },
  hits : {
    total : 656523,
    max_score : 1.0,
    hits : [ {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0675290941-w116699544-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0675314442-n1557245109-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0675335611-w210502362-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0675352866-w245359553-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0675354643-w235622232-regular,
      _score : 1.0
    } ]
  }
}



Re: Wrong paging with embedded es instance

2014-06-06 Thread Дмитрий Киселев
Sorry, that was a slightly different dataset.
Here is the response with the same data and 1 shard:
{
  took : 63,
  timed_out : false,
  _shards : {
    total : 1,
    successful : 1,
    failed : 0
  },
  hits : {
    total : 3879,
    max_score : 1.0,
    hits : [ {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0004074128-w145925740-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0007815983-w175372179-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0013765594-w145949343-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0015203925-w147150792-regular,
      _score : 1.0
    }, {
      _index : gazetteer,
      _type : location,
      _id : adrpnt-0016087997-w162848733-regular,
      _score : 1.0
    } ]
  }
}



Re: Wrong paging with embedded es instance

2014-06-06 Thread David Pilato
So? What's wrong here?

You asked for 5 docs and you get 5.

I'm missing something I guess.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Re: Wrong paging with embedded es instance

2014-06-06 Thread Дмитрий Киселев
I asked for 5 docs.

With 1 shard - I got 5 docs.
With 5 shards - I got 25 docs.
With 5 shards, using curl instead of embedded java client - I got 5 docs.
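
One way to see the difference between the 25-hit and 5-hit answers is to
merge the per-shard hits and keep only the requested window, which is
roughly what a coordinating step would do before returning a single page.
The sketch below is plain Java, not Elasticsearch code. It assumes that each
consecutive group of five hits in the 5-shard response quoted earlier came
from one shard, and it breaks the score tie (every hit has _score 1.0) by
ordering on _id; both are assumptions. The ids are copied as they appear in
the archive, including the one that looks truncated. For pages with from > 0,
each shard would have to supply its first from + size hits for this merge to
be correct.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: merge per-shard hit lists and trim to one global page.
class GlobalPageSketch {

    // Concatenate all shard results, order them, and keep [from, from + size).
    static List<String> page(List<List<String>> perShardHits, int from, int size) {
        List<String> merged = new ArrayList<>();
        for (List<String> shardHits : perShardHits) {
            merged.addAll(shardHits);
        }
        // Assumed tie-break: all scores are equal, so order by _id.
        Collections.sort(merged);
        int start = Math.min(from, merged.size());
        int end = Math.min(start + size, merged.size());
        return new ArrayList<>(merged.subList(start, end));
    }

    // The 25 ids from the 5-shard response, grouped five at a time (assumed shards).
    static List<List<String>> hitsFromThread() {
        return Arrays.asList(
            Arrays.asList(
                "adrpnt-0016087997-w162848733-regular",
                "adrpnt-0088827105-n2270743905-regular",
                "adrpnt-0097856729-n2270743903-regular",
                "adrpnt-0145983393-w154644839-regular",
                "adrpnt-0209772668-n1884206099-regular"),
            Arrays.asList(
                "adrpnt-0015203925-w147150792-regular",
                "adrpnt-0017569140-n2495059507-regular",
                "adrpnt-0071389729-w147150672-regular",
                "adrpnt-0130455978-w145925771-regular",
                "adrpnt-0139624280-w147150701-regular"),
            Arrays.asList(
                "adrpnt-007978-w194531715-regular",
                "adrpnt-0096243499-w194531714-regular",
                "adrpnt-0162691059-w164700540-regular",
                "adrpnt-0202220208-w164698447-regular",
                "adrpnt-0287053820-n2270743890-regular"),
            Arrays.asList(
                "adrpnt-0013765594-w145949343-regular",
                "adrpnt-0026389358-w147150656-regular",
                "adrpnt-0093401200-w162848869-regular",
                "adrpnt-0203517601-n2270743895-regular",
                "adrpnt-0217217459-n2270743898-regular"),
            Arrays.asList(
                "adrpnt-0004074128-w145925740-regular",
                "adrpnt-0007815983-w175372179-regular",
                "adrpnt-0039697750-w164700428-regular",
                "adrpnt-0053473990-w271448695-regular",
                "adrpnt-0129549665-w162848862-regular"));
    }

    public static void main(String[] args) {
        // Print the first global page of five ids.
        for (String id : page(hitsFromThread(), 0, 5)) {
            System.out.println(id);
        }
    }
}
```

With these inputs the first page contains the same five ids as the 1-shard
response for the same data, which is consistent with the per-shard
explanation but does not prove it.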



Re: Wrong paging with embedded es instance

2014-06-06 Thread David Pilato
Any chance you could share your code? In particular, I'd like to see how you 
run the query.

If you could reproduce it with a test case, that would be awesome.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Re: Wrong paging with embedded es instance

2014-06-06 Thread Дмитрий Киселев
I've made a snippet.

Code
https://github.com/kiselev-dv/es-test/blob/master/ESPagingTest/src/main/java/test/ESTest.java

Test results
https://github.com/kiselev-dv/es-test/blob/master/test.log
First test 1 shard (test.log line 27) - everything ok
Second test 5 shards (test.log line 86) - error

Search and paging generation
https://github.com/kiselev-dv/es-test/blob/master/ESPagingTest/src/main/java/test/ESTest.java#L73

One more strange thing: the search didn't find anything until I added a small
delay
https://github.com/kiselev-dv/es-test/blob/master/ESPagingTest/src/main/java/test/ESTest.java#L89


