[ 
https://issues.apache.org/jira/browse/METRON-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593086#comment-16593086
 ] 

Ali Nazemian commented on METRON-1677:
--------------------------------------

Given that we use ES/Solr for a time-series use case, incorporating a timestamp 
into ID generation might be a good idea. We are working on a Stellar function 
that produces a more Lucene-friendly ID for this case. I will share the outcome 
once it's tested.
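
For illustration only, here is a minimal Java sketch of one way a timestamp 
could be brought into the ID. It is not the Stellar function mentioned above. 
The class name TimePrefixedId and the layout (16 hex digits of epoch millis 
followed by 64 random bits) are assumptions made for this example, and wiring 
it into Metron is not shown.

```java
import java.security.SecureRandom;

// Hypothetical helper, not part of Metron or Stellar.
final class TimePrefixedId {
    private static final SecureRandom RANDOM = new SecureRandom();

    private TimePrefixedId() {}

    // Builds a 32-character ID: fixed-width hex timestamp, then 64 random bits.
    static String generate(long epochMillis) {
        long suffix = RANDOM.nextLong();
        // %016x zero-pads to 16 hex digits, so for non-negative timestamps
        // the lexicographic order of the IDs follows time order and IDs
        // created close together share a long common prefix.
        return String.format("%016x%016x", epochMillis, suffix);
    }

    // Convenience overload that uses the current wall-clock time.
    static String generate() {
        return generate(System.currentTimeMillis());
    }
}
```

Whether such a layout actually improves indexing or lookup performance for a 
given ES/Solr setup would still need to be measured.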

> UUIDv4 GUID is not Lucene friendly
> ----------------------------------
>
>                 Key: METRON-1677
>                 URL: https://issues.apache.org/jira/browse/METRON-1677
>             Project: Metron
>          Issue Type: Bug
>            Reporter: Ali Nazemian
>            Priority: Major
>
> Generating UUIDv4 GUIDs with UUID.randomUUID() in Java is not Lucene friendly: 
> it degrades Elasticsearch and Solr indexing/search performance and can make 
> it unpredictable.
> http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html
> Moreover, specifying the doc ID on the client side reduces indexing 
> throughput, because Elasticsearch then has to check for an existing document 
> with that ID, which turns each insert into an upsert. Hence, indexing 
> throughput can be increased by providing an option to disable ID generation 
> on the client side. Currently, the way the ID is generated can be overridden 
> at the config level by replacing the Metron default guid via Stellar, but it 
> is not possible to disable it completely and let Elasticsearch choose the ID 
> for the corresponding document.



