Does this mean you're applying the "reuters" analyzer to your
base64-encoded pictures?

I suspect it generates a huge number of tokens for each entry
because of your nGram filter (max_gram is set to 250).
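
To give a rough idea of the scale, here is a minimal sketch of the arithmetic in Python. It is not how Elasticsearch implements the filter; it only counts the substrings of every length between min_gram and max_gram for a single token. The function names and the 255-character token length are assumptions chosen for illustration.

```python
def ngram_count(token_length, min_gram=3, max_gram=250):
    """Count the n-grams produced for one token of the given length."""
    longest = min(token_length, max_gram)
    # A token of length L contains (L - n + 1) substrings of length n.
    return sum(token_length - n + 1 for n in range(min_gram, longest + 1))


def ngrams(token, min_gram=3, max_gram=250):
    """List the n-grams explicitly; only practical for short tokens."""
    longest = min(len(token), max_gram)
    return [token[i:i + n]
            for n in range(min_gram, longest + 1)
            for i in range(len(token) - n + 1)]


if __name__ == "__main__":
    print(ngrams("abcde"))    # short token, easy to check by hand
    print(ngram_count(255))   # one long token, e.g. a chunk of base64 (assumed length)
```

With min_gram at 3 and max_gram at 250, a single 255-character token already yields tens of thousands of n-grams, and a base64 picture contains many such chunks.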

Cédric Hourcade
c...@wal.fr


On Fri, Jun 20, 2014 at 9:09 AM, Tanguy Bernard
<bernardtanguy1...@gmail.com> wrote:
> Some more information:
> My "note_source" field contains pictures (.jpg, .png, ...) in base64, as well as text.
>
> For my mapping I used:
> "type" => "string"
> "analyzer" => "reuters" (the name of my analyzer)
>
>
> Any ideas?
>
> On Thursday, June 19, 2014 at 17:57:46 UTC+2, Tanguy Bernard wrote:
>>
>> Hello,
>> I have an issue when I index one particular field, "note_source" (an SQL
>> longtext column).
>> I use the same analyzer for every field (except date_source and id_source),
>> but for "note_source" I get a "warn monitor.jvm" warning.
>> When I remove "note_source", everything is fine. If I don't use an analyzer on
>> "note_source", everything is fine, but if I use my analyzer on "note_source",
>> I get crashes.
>>
>> I think I have enough memory; I have set ES_HEAP_SIZE.
>> Maybe my problem is related to accents (ASCII vs. UTF-8)?
>>
>> Can you help me with this?
>>
>>
>>
>> My Setting
>>
>>  public function createSetting($pf){
>>         $params = array('index' => $pf, 'body' => array(
>>         'settings' => array(
>>             'number_of_shards' => 5,
>>             'number_of_replicas' => 0,
>>             'analysis' => array(
>>                 'filter' => array(
>>                     'nGram' => array(
>>                         "token_chars" =>array(),
>>                         "type" => "nGram",
>>                         "min_gram" => 3,
>>                         "max_gram"  => 250
>>                     )
>>                 ),
>>                 'analyzer' => array(
>>                     'reuters' => array(
>>                         'type' => 'custom',
>>                         'tokenizer' => 'standard',
>>                         'filter' => array('lowercase', 'asciifolding', 'nGram')
>>                     )
>>                 )
>>             )
>>         )
>>         ));
>>         $this->elasticsearchClient->indices()->create($params);
>>         return;
>> }
>>
>>
>> My Indexing
>>
>> public function indexTable($pf,$typeElement){
>>
>>         $params =array(
>>             "index" =>'_river',
>>             "type" => $typeElement,
>>             "id" => "_meta",
>>             "body" =>array(
>>
>>                 "type" => "jdbc",
>>                 "jdbc" => array(
>>                     "url" => "jdbc:mysql://ip/name",
>>                     "user" => 'root',
>>                     "password" => 'mdp',
>>                     "index" => $pf,
>>                     "type" => $typeElement,
>>                     "sql" => "select id_source as _id, id_sous_theme, titre_source, desc_source, note_source, adresse_source, type_source, date_source from source",
>>                     "max_bulk_requests" => 5,
>>                     )
>>             )
>>
>>         );
>>
>>
>>         $this->elasticsearchClient->index($params);
>> }
>>
>> Thanks in advance.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/5d93217c-bded-40fa-8fd2-fdac576c57ee%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
