Druid stores strings as UTF-8, so from a storage and query standpoint it should work fine with any language. The "wikiticker-2015-09-12-sampled.json.gz" dataset used for the tutorial has strings in a variety of languages (check the "page" field): https://druid.apache.org/docs/latest/tutorials/index.html
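
Since Druid expects UTF-8 text, one quick way to rule out an input-side encoding problem is to check whether the raw file decodes as UTF-8 before ingesting it. The following is only a minimal sketch: the function names and the `path` argument are hypothetical, and EUC-KR appears in the usage note purely as an example of a non-UTF-8 encoding, not as a guess about your data.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None if valid."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded.
        return err.start
    return None


def check_file(path: str) -> Optional[int]:
    """Read a file as raw bytes (no implicit decoding) and validate it as UTF-8."""
    with open(path, "rb") as handle:
        return first_invalid_utf8_offset(handle.read())
```

If `check_file` returns `None` for your input file, the bytes are valid UTF-8 and the problem is more likely elsewhere (for example in how the result is displayed). If it returns an offset, the file was probably written in another encoding, such as EUC-KR, and would need to be re-encoded before ingestion.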
So I wonder if there is an encoding problem with reading your input data? If it's in a text format, it should be encoded as UTF-8 for Druid to be able to read it properly.

On Fri, Jul 16, 2021 at 7:51 AM Y H <yurim2...@gmail.com> wrote:

> hi, i am using druid for develop analytic-web.
> And i found druid can't parse language without english
>
> [image: image.png]
>
> is there any option on utf-8 OR way to parse string correctly?
>
> i attached my druid environment file,
> please let me know way to parse string in druid
>
> thanks.
>
>
>
> environment
> ___________________________________________________
> DRUID_XMS=1g
> DRUID_MAXNEWSIZE=250m
> DRUID_NEWSIZE=250m
> DRUID_MAXDIRECTMEMORYSIZE=6172m
>
> druid_emitter_logging_logLevel=debug
>
> druid_extensions_loadList=["druid-stats","druid-histogram",
> "druid-datasketches", "druid-lookups-cached-global",
> "postgresql-metadata-storage", "druid-kafka-indexing-service",
> "druid-kafka-extraction-namespace"]
>
> druid_zk_service_host=zookeeper
>
> # kafka config
> listeners=PLAINTEXT://211.253.8.155:59092
>
>
> # druid_metadata_storage_host=
> druid_metadata_storage_type=postgresql
>
> druid_metadata_storage_connector_connectURI=jdbc:postgresql://postgres:5432/druid
> druid_metadata_storage_connector_user=druid
> druid_metadata_storage_connector_password=FoolishPassword
>
> druid_coordinator_balancer_strategy=cachingCost
>
> druid_indexer_runner_javaOptsArray=["-server", "-Xmx1g", "-Xms1g",
> "-XX:MaxDirectMemorySize=3g", "-Duser.timezone=UTC",
> "-Dfile.encoding=UTF-8",
> "-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager"]
> druid_indexer_fork_property_druid_processing_buffer_sizeBytes=268435456
>
> druid_storage_type=local
> druid_storage_storageDirectory=/opt/data/segments
> druid_indexer_logs_type=file
> druid_indexer_logs_directory=/opt/data/indexing-logs
>
> druid_processing_numThreads=2
> druid_processing_numMergeBuffers=2
>
>
> DRUID_LOG4J=<?xml version="1.0" encoding="UTF-8" ?><Configuration
> status="WARN"><Appenders><Console name="Console"
> target="SYSTEM_OUT"><PatternLayout pattern="%d{ISO8601} %p [%t] %c -
> %m%n"/></Console></Appenders><Loggers><Root level="info"><AppenderRef
> ref="Console"/></Root><Logger name="org.apache.druid.jetty.RequestLog"
> additivity="false" level="DEBUG"><AppenderRef
> ref="Console"/></Logger></Loggers></Configuration>