Are you using the console, or an ingestion spec? If you're using a spec, you might attach it. If you're using the console and the strings have commas in them, the .tsv format might work: you can create a file with a different delimiter, since in .tsv the delimiter is configurable and doesn't have to be a tab. Or you can take a screenshot of what's happening and attach that; it might help.
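For reference, a minimal sketch of what the `inputFormat` section of such a spec might look like with a custom delimiter. The delimiter and column names here are illustrative assumptions; adjust them to your data:

```json
"inputFormat": {
  "type": "tsv",
  "delimiter": "|",
  "findColumnsFromHeader": false,
  "columns": ["time", "page", "comment"]
}
```

With `"type": "tsv"`, the `delimiter` property can be any single character, so a pipe (or anything else absent from your data) avoids the comma-quoting issues of .csv.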
On Fri, Jul 16, 2021 at 11:25 AM Y H <yurim2...@gmail.com> wrote:

> Thanks! But I still have a problem.
>
> I succeeded in storing strings as UTF-8 with inline text ingestion. But when I
> try a batch ingestion from CSV, the text comes out garbled.
>
> The problem seems to happen when the CSV is read. Should I transform the CSV
> file to a text file? And if I ingest batch data from a text file, what type of
> parser should I choose (still .csv?)
>
> On Sat, Jul 17, 2021 at 1:46 AM, Gian Merlino <g...@apache.org> wrote:
>
> > Including the original poster in case they are not on the dev list
> > themselves (hello!).
> >
> > On Fri, Jul 16, 2021 at 9:44 AM Gian Merlino <g...@apache.org> wrote:
> >
> >> Druid stores strings as UTF-8, and from a storage and query standpoint it
> >> should work fine with any language. The
> >> "wikiticker-2015-09-12-sampled.json.gz" dataset used for the tutorial has
> >> strings in a variety of languages (check the "page" field):
> >> https://druid.apache.org/docs/latest/tutorials/index.html
> >>
> >> So I wonder if there is an encoding problem with reading your input data?
> >> If it's in a text format, it should be encoded as UTF-8 for Druid to be
> >> able to read it properly.
> >>
> >> On Fri, Jul 16, 2021 at 7:51 AM Y H <yurim2...@gmail.com> wrote:
> >>
> >>> Hi, I am using Druid to develop an analytics web app, and I found that
> >>> Druid can't parse languages other than English.
> >>>
> >>> [image: image.png]
> >>>
> >>> Is there any UTF-8 option, or a way to parse the strings correctly?
> >>>
> >>> I attached my Druid environment file; please let me know how to parse
> >>> strings in Druid.
> >>>
> >>> Thanks.
> >>> environment
> >>> ___________________________________________________
> >>> DRUID_XMS=1g
> >>> DRUID_MAXNEWSIZE=250m
> >>> DRUID_NEWSIZE=250m
> >>> DRUID_MAXDIRECTMEMORYSIZE=6172m
> >>>
> >>> druid_emitter_logging_logLevel=debug
> >>>
> >>> druid_extensions_loadList=["druid-stats","druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "postgresql-metadata-storage", "druid-kafka-indexing-service", "druid-kafka-extraction-namespace"]
> >>>
> >>> druid_zk_service_host=zookeeper
> >>>
> >>> # kafka config
> >>> listeners=PLAINTEXT://211.253.8.155:59092
> >>>
> >>> # druid_metadata_storage_host=
> >>> druid_metadata_storage_type=postgresql
> >>> druid_metadata_storage_connector_connectURI=jdbc:postgresql://postgres:5432/druid
> >>> druid_metadata_storage_connector_user=druid
> >>> druid_metadata_storage_connector_password=FoolishPassword
> >>>
> >>> druid_coordinator_balancer_strategy=cachingCost
> >>>
> >>> druid_indexer_runner_javaOptsArray=["-server", "-Xmx1g", "-Xms1g", "-XX:MaxDirectMemorySize=3g", "-Duser.timezone=UTC", "-Dfile.encoding=UTF-8", "-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager"]
> >>> druid_indexer_fork_property_druid_processing_buffer_sizeBytes=268435456
> >>>
> >>> druid_storage_type=local
> >>> druid_storage_storageDirectory=/opt/data/segments
> >>> druid_indexer_logs_type=file
> >>> druid_indexer_logs_directory=/opt/data/indexing-logs
> >>>
> >>> druid_processing_numThreads=2
> >>> druid_processing_numMergeBuffers=2
> >>>
> >>> DRUID_LOG4J=<?xml version="1.0" encoding="UTF-8" ?><Configuration status="WARN"><Appenders><Console name="Console" target="SYSTEM_OUT"><PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/></Console></Appenders><Loggers><Root level="info"><AppenderRef ref="Console"/></Root><Logger name="org.apache.druid.jetty.RequestLog" additivity="false" level="DEBUG"><AppenderRef ref="Console"/></Logger></Loggers></Configuration>
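If the CSV was exported by a tool that wrote it in a legacy encoding (CP949/EUC-KR is a common default for Korean-language Windows tools), one option is to re-encode the file to UTF-8 before handing it to Druid. A minimal sketch; the `cp949` source encoding and the file names are assumptions, so substitute whatever your exporter actually used:

```python
# Re-encode a text/CSV file to UTF-8 so Druid can read it correctly.
# Assumption: the source file is cp949; change src_encoding to match
# the encoding your export tool actually produced.

def reencode_to_utf8(src_path, dst_path, src_encoding="cp949"):
    with open(src_path, "r", encoding=src_encoding) as src:
        data = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(data)

if __name__ == "__main__":
    # Hypothetical file names for illustration.
    reencode_to_utf8("input.csv", "input-utf8.csv")
```

After re-encoding, the file can be ingested with the normal csv input format; no special parser is needed, since the problem is the byte encoding rather than the CSV structure.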