Re: IllegalArgumentException[No type mapped for [43]], version 1.2.1

2014-06-16 Thread joergpra...@gmail.com
I guess you hit the following condition:

- you insert data with bulk indexing

- your index has dynamic mapping and already has huge field mappings

- bulk requests span over many nodes / shards / replicas and introduce tons
of new fields into the dynamic mapping

- you do not wait for bulk responses before sending new bulk requests

That is, ES works hard to create the new field mappings, but the resulting
mapping does not reach the other node before new bulks arrive there. That
node sees that there must be a mapping for a new field, but the cluster
state has none to present, although the field was being mapped.

Maybe the cluster state is not sent at all, or it could not be read fully
from disk, or it is stuck somewhere else.

ES tries hard to prevent such conditions by assigning high priority to
cluster state messages that are sent throughout the cluster. Also, ES
avoids flooding of such messages.

Your observation is correct: the longer you run bulk indexing with the
same type of data (random data excepted), the fewer new field mappings are
created, and so the fewer new cluster states ES has to publish.

You can try the following to tackle this challenge:

- pre-create the field mappings for your indexes, or even better,
pre-create indices and disable dynamic mapping, so no cluster state changes
have to be promoted

- switch to synchronous bulk requests, or reduce concurrency in your bulk
requests, so that the bulk indexing routine waits until the cluster state
changes are consistent on all nodes

- reduce the (perhaps huge) number of field mappings (more a question about
the type of data you index)

- reduce number of nodes (obviously an anti-pattern)

- or reduce the replica level (always a good thing for efficiency while using
bulk indexing), to give the cluster some breathing room to broadcast the new
cluster states to the corresponding nodes in a shorter time
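
The second suggestion (wait for bulk responses, or at least cap concurrency) could be sketched roughly as follows. This is only an illustration in Python, not the Java transport client code from the original setup. The names bulk_body, send_bulks and max_in_flight are made up for this example, and send is assumed to be a caller-supplied function that posts one payload to the _bulk endpoint and blocks until the response arrives.

```python
import json
import threading
from concurrent.futures import ThreadPoolExecutor


def bulk_body(index, doc_type, docs):
    """Build a newline-delimited _bulk payload from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


def send_bulks(batches, send, max_in_flight=1):
    """Send every batch, never leaving more than max_in_flight unanswered.

    `send` is a hypothetical caller-supplied function: it posts one payload
    and returns only after the response has been received.
    With max_in_flight=1 the sender is fully synchronous.
    """
    slots = threading.BoundedSemaphore(max_in_flight)

    def guarded(payload):
        try:
            return send(payload)
        finally:
            slots.release()  # free the slot once the response is in (or failed)

    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = []
        for payload in batches:
            slots.acquire()  # block here until an earlier request is answered
            futures.append(pool.submit(guarded, payload))
        # Responses are returned in submission order; errors are re-raised.
        return [f.result() for f in futures]
```

Starting with max_in_flight=1 and raising it only while the GET-by-_id errors stay away would be one way to find out how much concurrency the cluster tolerates.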

Jörg



On Mon, Jun 16, 2014 at 10:34 PM, Brooke Babcock brookebabc...@gmail.com
wrote:

 Thanks for the reply.
 We've checked the log files on all the nodes - no errors or warnings.
 Disks were practically empty - it was a fresh cluster, fresh index.

 We have noticed that the problem occurs less frequently the more data we
 send to the cluster. Our latest theory is that it corrects itself
 (meaning, we are able to get by _id again) once a flush occurs. So by
 sending it more data, we are ensuring that flushes happen more often.


 On Monday, June 16, 2014 8:05:15 AM UTC-5, Alexander Reelsen wrote:

 Hey,

 it seems as if writing into the translog fails at some stage (from a
 complete bird's-eye view). Can you check your log files for any weird
 exceptions logged before that happens? Also, did you run out of disk
 space at any time when this happened?


 --Alex



IllegalArgumentException[No type mapped for [43]], version 1.2.1

2014-06-06 Thread Brooke Babcock
In one part of our application we use Elasticsearch as an object store. 
Therefore, when indexing, we supply our own _id. Likewise, when accessing a 
document we use the simple GET method to fetch by _id. This has worked well 
for us, up until recently. Normally, this is what we get:

curl -XGET 'http://127.0.0.1:9200/data-2014.06.06/key/test1?pretty=true'
{
  "_index" : "data-2014.06.06",
  "_type" : "key",
  "_id" : "test1",
  "_version" : 1,
  "found" : true,
  "_source":{"sData":"test data 1"}
}


Now, we often encounter a recently indexed document that throws the 
following error when we try to fetch it:

curl -XGET 'http://127.0.0.1:9200/data-2014.06.06/key/test2?pretty=true'
{
  "error":"IllegalArgumentException[No type mapped for [43]]",
  "status":500
}



This condition persists anywhere from 1 to 25 minutes or so, at which point 
we no longer receive the error for that document and the GET succeeds as 
normal. From that point on, we are able to consistently retrieve that 
document by _id without issue. But, soon after, we will find a different 
newly indexed document caught in the same bad state.

We know the documents are successfully indexed. Our bulk sender (which uses 
the Java transport client) indicates no error during indexing and we are 
still able to locate the document by doing an ids query, such as:

curl -XPOST 'http://127.0.0.1:9200/data-2014.06.06/key/_search?pretty=true' -d '
{
  "query": {
    "ids": {
      "values": ["test2"]
    }
  }
}'

Which responds:
{
  "took": 543,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 10,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.0,
    "hits": [ {
      "_index": "data-2014.06.06",
      "_type": "key",
      "_id": "test2",
      "_score": 1.0,
      "_source":{"sData": "test data 2"}
    } ]
  }
}


We first noticed this behavior in version 1.2.0. When we upgraded to 1.2.1, 
we deleted all indexes and started with a fresh cluster. We hoped our 
problem would be solved by the big fix that came in 1.2.1, but we are still 
regularly seeing it. Although our situation may sound like the routing bug 
introduced in 1.2.0, we are certain that it is not. This appears to be a 
significant issue with the translog - we hope the developers will be able 
to look at what may have changed. We did not notice this problem in version 
1.1.1.

Just in case, here is the mapping being used:
curl -XGET 'http://127.0.0.1:9200/data-2014.06.06/key/_mapping?pretty=true'
{
  "data-2014.06.06" : {
    "mappings" : {
      "key" : {
        "_all" : {
          "enabled" : false
        },
        "properties" : {
          "sData" : {
            "type" : "string",
            "index" : "no"
          }
        }
      }
    }
  }
}
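
To act on the suggestion made in this thread to pre-create indices with dynamic mapping disabled, a create-index request body for the mapping above might be built as in the sketch below. The helper name index_body is invented for this example. Using "dynamic": "strict" (which rejects documents containing unmapped fields) and starting with zero replicas during the bulk load are assumptions about what would suit this setup, not something confirmed in the thread.

```python
import json


def index_body(doc_type, properties, replicas=0):
    """Return a create-index request body that fixes the mapping up front.

    Hypothetical helper: the body would be sent with a PUT to the index URL
    before any documents are indexed.
    """
    return {
        # Assumption: load with few replicas, then raise the count afterwards.
        "settings": {"number_of_replicas": replicas},
        "mappings": {
            doc_type: {
                # Unmapped fields are rejected instead of triggering
                # a mapping update that must reach every node.
                "dynamic": "strict",
                "_all": {"enabled": False},
                "properties": properties,
            }
        },
    }


# The single non-indexed string field from the mapping shown above.
body = index_body("key", {"sData": {"type": "string", "index": "no"}})
print(json.dumps(body, indent=2))
```

With only one field in this mapping, the main effect would be to rule out mapping updates as a cause, since no document could add a field.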


Thanks for your help.


