#general


@pranasblk: @pranasblk has joined the channel
@chad.preisler: I need to transform an encrypted Kafka message before Pinot processes it. Right now for our stream apps we use a custom serde to do it. How can I do it in Pinot? Looks like it would be fairly easy to change Pinot to allow a deserializer to be plugged in. Thoughts?
  @g.kishore: Is the custom serde related to the format?
  @g.kishore: Pinot lets you write a custom decoder
  @chad.preisler: Does the decoder get the raw message from the topic? We will need the entire message unaltered. If so are there any docs on how to use a custom decoder?
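For reference, here is a minimal, hypothetical sketch of what the decoder-plugin route could look like. It is not a tested implementation: it assumes the 0.7-era `org.apache.pinot.spi.stream.StreamMessageDecoder` interface (the `init` signature has changed across Pinot versions) and a placeholder `decrypt` step. The decoder receives the raw, unaltered message bytes from the topic, decrypts them, and delegates to the stock `KafkaJSONMessageDecoder`:
```
import java.util.Arrays;
import java.util.Map;
import java.util.Set;

import org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder;
import org.apache.pinot.spi.data.readers.GenericRow;
import org.apache.pinot.spi.stream.StreamMessageDecoder;

public class DecryptingJsonMessageDecoder implements StreamMessageDecoder<byte[]> {
  // Stock JSON decoder we delegate to once the payload is decrypted
  private final KafkaJSONMessageDecoder _delegate = new KafkaJSONMessageDecoder();

  @Override
  public void init(Map<String, String> props, Set<String> fieldsToRead, String topicName)
      throws Exception {
    // Decryption keys/config can be passed in through the stream configs (props)
    _delegate.init(props, fieldsToRead, topicName);
  }

  @Override
  public GenericRow decode(byte[] payload, GenericRow destination) {
    // 'payload' is the raw, unaltered message bytes from the topic
    return _delegate.decode(decrypt(payload), destination);
  }

  @Override
  public GenericRow decode(byte[] payload, int offset, int length, GenericRow destination) {
    return decode(Arrays.copyOfRange(payload, offset, offset + length), destination);
  }

  private byte[] decrypt(byte[] ciphertext) {
    // Placeholder: plug in your actual cipher here
    return ciphertext;
  }
}
```
Such a decoder would be wired in through the table's streamConfigs via `stream.kafka.decoder.class.name`, the same key the stock JSON decoder uses.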
@chad.preisler: Seems like Pinot is stuck on an older version of the JDK due to its use of off-heap memory APIs that no longer exist. The code does not compile on JDK 15. Also, the “shade” plugin does not work on JDK 15. I read that JDK 16 has some new methods for using off-heap memory. Is there a plan to move to a modern JDK? Is off-heap even necessary now that ZGC can handle 16TB of heap with little to no pause time?
  @g.kishore: We plan to deprecate support for Java 8 and support newer JDKs. We still need off heap but we can definitely benefit from ZGC
  @chad.preisler: I’m curious, what is the reason for off heap? Is it the size limitation on arrays?
  @chad.preisler: Also, does Pinot build with JDK 15/16? I had trouble building with all tests using JDK 15. The first issue I saw was with Xerial/larray.
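The "why off-heap" question doesn't get a full answer in the thread, but the usual rationale can be sketched: off-heap (direct or memory-mapped) buffers keep segment data out of the GC's reach entirely, and in the mmap case let the working set far exceed `-Xmx` because the OS pages data in on demand. This minimal example only illustrates the two standard JDK mechanisms; it is not Pinot's actual buffer code:
```
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class OffHeapExample {
  public static void main(String[] args) throws Exception {
    // Direct buffer: allocated outside the Java heap, never scanned by the GC
    ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024 * 1024);
    direct.putLong(0, 42L);

    // Memory-mapped file: the OS page cache backs the data, so the working
    // set can be far larger than the heap, and reloading after a restart is cheap
    try (RandomAccessFile raf = new RandomAccessFile("/tmp/segment.buf", "rw")) {
      MappedByteBuffer mapped = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 1 << 20);
      mapped.putLong(0, 42L);
    }
  }
}
```
Even a pause-free collector still has to track every object on the heap, and mmap additionally avoids copying segment data onto the heap at all, which is why ZGC alone doesn't make off-heap redundant.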
@humengyuk18: Is there a way to specify the group id for Kafka realtime ingestion? What should the ingestion config key be?
  @fx19880617: low-level consumer mode is per-partition based, so no need to specify group-id
  @humengyuk18: Is there a way to monitor the latency for consuming when using the low-level consumer?
  @fx19880617: if u want to specify the group id you can try: ```stream.[streamType].consumer.prop.group.id```
  @fx19880617: basically whatever comes after `stream.[streamType].consumer.prop.` will be put into the kafka consumer configs
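Concretely, in the table config's streamConfigs that would look like the following (the group id value here is made up):
```
"streamConfigs": {
  "streamType": "kafka",
  "stream.kafka.topic.name": "mytopic",
  "stream.kafka.consumer.prop.group.id": "pinot-mytable-consumer"
}
```
Everything after the `stream.kafka.consumer.prop.` prefix is stripped and passed straight to the Kafka consumer, here as `group.id=pinot-mytable-consumer`.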
  @humengyuk18: I see, so the group id has no effect on low-level consumer?
  @fx19880617: no, but it may help report the consuming latency
  @fx19880617: just you cannot use it to reset the consumer
  @humengyuk18: I specified the group id using `stream.kafka.consumer.prop.group.id`, but it’s not working; that group id is not shown in kafka.
  @fx19880617: I see, then in that case, we need to implement the API internally to fetch the delay
@ronak: I was exploring TEXT_MATCH functionality with pinot-0.7.0/0.6.0 and had configured one of the columns for it. Is there any configuration for the refresh time interval for the index? After enabling indexing (with `index type: Text` and `encoding type: RAW`) on the column and doing TEXT_MATCH, I first got an empty result, but after some time I got the result. So, what is the initial delay before such a column becomes searchable? Are there any settings/configurations (e.g. number of docs, indexed size, etc.) for this?
  @steotia: Hi Ronak, the refresh threshold isn't yet configurable. We can make it configurable though. However, there was a bug I recently fixed that was causing the lag. Just realized the fix is in my branch. Will create a PR
  @rishbits1994: So by design/indexing there won’t be any significant lag?
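For context, the text index is enabled per column in the table config; the snippet below follows the documented shape (the column name is illustrative), with the column also listed as no-dictionary since text index columns use raw encoding:
```
"fieldConfigList": [
  {
    "name": "myTextColumn",
    "encodingType": "RAW",
    "indexType": "TEXT"
  }
],
"tableIndexConfig": {
  "noDictionaryColumns": ["myTextColumn"]
}
```
The index is then queried with `TEXT_MATCH`, e.g. `SELECT * FROM myTable WHERE TEXT_MATCH(myTextColumn, 'pinot AND realtime')`.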
@brianolsen87: @brianolsen87 has joined the channel
@ali: @ali has joined the channel
@brianolsen87: Hey all :wave: Just jumping into this awesome tech called Pinot! I'm a developer advocate from the Trino project. Tomorrow we're having an episode with @fx19880617 and @elon.azoulay. We're covering the benefits of Trino + Pinot and why you really need Pinot to speed up your common aggregation queries for predictable response times, while also gaining the benefit of federated queries over your data lake or other data sources. We'll cover a bit of the specific limitations and current work going on in the Trino-Pinot connector, and finally I'll run a simple demo with the connector! Come watch me crash my docker containers at 11am EDT tomorrow.
  @srini: don’t miss Brian’s guitar solo at the beginning :musical_note:
  @joshhighley: will you be discussing the 50000 row limit (evaluated rows, not returned rows)? Given the size of the datasets intended for Pinot and Trino, this seems like a _really_ low default limit
  @elon.azoulay: Yep
  @brianolsen87: @joshhighley We'll be discussing that tomorrow, and @elon.azoulay has a pretty neat solution coming in future versions of Trino. See you all tomorrow at 11am EDT! :rabbit2::rabbit2:
@ali: :wave: Pinot community! I work with the Presto community and wanted to share that PrestoCon Day is next week, featuring some great Pinot talks - @fx19880617 is doing a session on Realtime Analytics with Presto & Apache Pinot and @g.kishore is on The Presto Ecosystem Panel. Super excited for the event - it's virtual & registration is free - hope to see many of you :wine_glass: there :grin::thumbsup:
@asif: @asif has joined the channel
@amherman: @amherman has joined the channel

#random


@pranasblk: @pranasblk has joined the channel
@brianolsen87: @brianolsen87 has joined the channel
@ali: @ali has joined the channel
@asif: @asif has joined the channel
@amherman: @amherman has joined the channel

#troubleshooting


@pranasblk: @pranasblk has joined the channel
@ravi.maddi: *Hi All,* I started quick-start-batch instead of quick-start-streaming. How do I stop quick-start-batch, any idea?
  @fx19880617: ```➜ ps wuax | grep "org.apache.pinot.tools.Quickstart"
xiangfu 79906 100.1 3.0 9461188 1002132 s007 R+ 12:57AM 0:56.62 /Library/Internet Plug-Ins/JavaAppletPlugin.plugin/Contents/Home/bin/java -Xms1G -Xmx1G -Dlog4j2.configurationFile=conf/quickstart-log4j2.xml -Dplugins.dir=/Users/xiangfu/Downloads/apache-pinot-incubating-0.6.0-bin/plugins -classpath /Users/xiangfu/Downloads/apache-pinot-incubating-0.6.0-bin/lib/* -Dapp.name=quick-start-batch -Dapp.pid=79906 -Dapp.repo=/Users/xiangfu/Downloads/apache-pinot-incubating-0.6.0-bin/lib -Dapp.home=/Users/xiangfu/Downloads/apache-pinot-incubating-0.6.0-bin -Dbasedir=/Users/xiangfu/Downloads/apache-pinot-incubating-0.6.0-bin org.apache.pinot.tools.Quickstart
➜ kill 79906```
  @fx19880617: also `ctrl` + `c` if you are on the same terminal session
  @ravi.maddi: Thanks @fx19880617, or `bin/pinot-admin.sh StopProcess -controller -server -broker -zooKeeper`
@ravi.maddi: Hi All, I am trying to create a schema, but I am getting an error:
```
bin/pinot-admin.sh AddTable -tableConfigFile $PDATA_HOME table-config.json -schemaFile schema-config.json -controllerPort 9000 -exec

Executing command: AddTable -tableConfigFile table-config.json -schemaFile schema-config.json -controllerProtocol http -controllerHost 172.31.10.219 -controllerPort 9000 -exec
Sending request: to controller: localhost, version: Unknown
Got Exception to upload Pinot Schema: myschema
org.apache.pinot.common.exception.HttpErrorStatusException: Got error status code: 400 (Bad Request) with reason: "Cannot add invalid schema: myschema. Reason: null" while sending request: to controller: localhost, version: Unknown
	at org.apache.pinot.common.utils.FileUploadDownloadClient.sendRequest(FileUploadDownloadClient.java:397) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
```
Need Help :slightly_smiling_face:
Table Config:
```
{
  "tableName": "mytable",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "timeColumnName": "_source.sDate",
    "timeType": "MILLISECONDS",
    "schemaName": "myschema",
    "replicasPerPartition": "1"
  },
  "tenants": {},
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.topic.name": "mytopic",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.broker.list": "localhost:9876",
      "realtime.segment.flush.threshold.time": "3600000",
      "realtime.segment.flush.threshold.size": "50000",
      "stream.kafka.consumer.prop.auto.offset.reset": "smallest"
    }
  },
  "metadata": {
    "customConfigs": {}
  }
}
```
And Schema Config:
```
{
  "schemaName": "myschema",
  "eventflow": [
    { "name": "_index", "dataType": "INT" },
    { "name": "_type", "dataType": "STRING" },
    { "name": "id", "dataType": "INT" },
    { "name": "_source.madids", "datatype": "INT", "singleValueField": false },
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "_source.sDate",
      "dataType": "STRING",
      "format": "1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm:ss",
      "granularity": "1:DAYS"
    }
  ]
}
```
  @ravi.maddi: Need help Friends
  @ken: What’s the `eventflow` field? Shouldn’t that be named `dimensionFieldSpecs`? And without a `metricFieldSpecs` array of fields for doing aggregations, I’m not sure how you’re going to effectively use Pinot :)
  @chinmay.cerebro: @ravi.maddi here's a good reference for creating a schema: (sample schema included)
  @chinmay.cerebro: In terms of the "Reason: null", I don't see that on the latest master; I'm investigating to see why this could've happened on 0.7
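Putting ken's point together with the JSON itself, a corrected schema would rename `eventflow` to `dimensionFieldSpecs`, fix the misspelled `datatype` key on `_source.madids`, and drop the trailing comma that makes the array invalid JSON:
```
{
  "schemaName": "myschema",
  "dimensionFieldSpecs": [
    { "name": "_index", "dataType": "INT" },
    { "name": "_type", "dataType": "STRING" },
    { "name": "id", "dataType": "INT" },
    { "name": "_source.madids", "dataType": "INT", "singleValueField": false }
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "_source.sDate",
      "dataType": "STRING",
      "format": "1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm:ss",
      "granularity": "1:DAYS"
    }
  ]
}
```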
@matteo.santero: Hello, is there a document that explains the “cutoff” time in detail for data handled by time (and primary key, in this case)? I am asking because it seems that I have a record that is present in both OFFLINE and REALTIME (with the same “primary key”), but when I look for it in the final table I am not finding it at all.
OFFLINE record TIME 1615891108000 (ms) — max 1615939199415 — min 1612137600000
REALTIME record TIME 1615723114000 (ms) — max 1615981903000 — min 1515496517000
FINAL record not present — max 1615981930000 — min 1612137600000
  @mayanks: You can refer to time boundary:
  @mayanks: @jackie.jxt are there any limitations/issues with time boundary computation when time unit is millis?
  @matteo.santero: Thank you for the doc
  @jackie.jxt: There are no limitations. @matteo.santero Does the OFFLINE table have the same records as the REALTIME table for the overlapping time? They should be the same in order to return the correct result
  @matteo.santero: offline and realtime have a set of records that overlap in time; within that overlap some records can have the same primary key but different data and different times
  @matteo.santero: in this case the pk was the same in both, the data and the time were different, and the result in the “final” table was empty
  @jackie.jxt: I don't fully follow here. Because you mentioned primary key, I assume you enabled upsert for the table? Upsert only works on realtime-only tables, not hybrid tables. Also, why do the real-time records have a much wider span than the offline records? FYI, here is the definition of the hybrid table:
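For readers following along, the time-boundary behavior under discussion works roughly like this (illustrative pseudo-SQL; `boundary` is derived from the OFFLINE table's max end time, and `ts` stands for the table's time column):
```
-- The broker splits a hybrid-table query in two and merges the results:
SELECT ... FROM mytable_OFFLINE  WHERE ts <= boundary
SELECT ... FROM mytable_REALTIME WHERE ts >  boundary
-- A record that exists only in REALTIME but has ts <= boundary falls on the
-- OFFLINE side of the split, so it never appears in the merged result,
-- which is consistent with the missing record described above.
```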
@yash.agarwal: Hello, Is there any performance difference between the following two queries for pinot `select distinct city from transactions limit 100000` `select city from transactions group by city limit 100000`
  @ken: My guess was no, as implementation-wise it’s a similar operation. But just for grins I tried it on a large dataset (1.7b records) and got similar performance. I’d guess that memory usage would also be similar.
  @mayanks: Hey, distinct and group by use different engines internally, even though semantically they mean the same thing and might end up doing a similar amount of work.
@brianolsen87: @brianolsen87 has joined the channel
@ali: @ali has joined the channel
@qiaochu: @qiaochu has joined the channel
@ujwala.tulshigiri: @ujwala.tulshigiri has joined the channel
@qiaochu: hello team, I got an error after rebasing onto the latest Pinot master when running `mvn test`. I think it’s related to the dependency org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test. `mvn clean install` works perfectly. Is there a solution to fix this error?
```
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test (default-test) on project pinot-spi: There are test failures.
[ERROR]
[ERROR] Please refer to /Users/qiaochu/Fork/incubator-pinot/pinot-spi/target/surefire-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /Users/qiaochu/Fork/incubator-pinot/pinot-spi && /Library/Java/JavaVirtualMachines/adoptopenjdk-11.jdk/Contents/Home/bin/java -javaagent:/Users/qiaochu/.m2/repository/org/jacoco/org.jacoco.agent/0.7.7.201606060606/org.jacoco.agent-0.7.7.201606060606-runtime.jar=destfile=/Users/qiaochu/Fork/incubator-pinot/pinot-spi/target/jacoco.exec -Xms4g -Xmx4g -jar /Users/qiaochu/Fork/incubator-pinot/pinot-spi/target/surefire/surefirebooter12276158737966275331.jar /Users/qiaochu/Fork/incubator-pinot/pinot-spi/target/surefire 2021-03-17T10-07-59_551-jvmRun1 surefire6422534819305887743tmp surefire_05875304854941226633tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 134
[ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /Users/qiaochu/Fork/incubator-pinot/pinot-spi && /Library/Java/JavaVirtualMachines/adoptopenjdk-11.jdk/Contents/Home/bin/java -javaagent:/Users/qiaochu/.m2/repository/org/jacoco/org.jacoco.agent/0.7.7.201606060606/org.jacoco.agent-0.7.7.201606060606-runtime.jar=destfile=/Users/qiaochu/Fork/incubator-pinot/pinot-spi/target/jacoco.exec -Xms4g -Xmx4g -jar /Users/qiaochu/Fork/incubator-pinot/pinot-spi/target/surefire/surefirebooter12276158737966275331.jar /Users/qiaochu/Fork/incubator-pinot/pinot-spi/target/surefire 2021-03-17T10-07-59_551-jvmRun1 surefire6422534819305887743tmp surefire_05875304854941226633tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 134
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:748)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:305)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:265)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1314)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1159)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:932)
[ERROR] at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:137)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:210)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:156)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:148)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
[ERROR] at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:56)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:305)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:192)
[ERROR] at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:105)
[ERROR] at org.apache.maven.cli.MavenCli.execute(MavenCli.java:957)
[ERROR] at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:289)
[ERROR] at org.apache.maven.cli.MavenCli.main(MavenCli.java:193)
[ERROR] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[ERROR] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[ERROR] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[ERROR] at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:282)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:225)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:406)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:347)
[ERROR]
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1]
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <args> -rf :pinot-spi
```
  @fx19880617: this is an issue with the test jar when using Java 11
  @fx19880617: changing the JDK to 8 should solve it
  @qiaochu: gotcha, thanks @fx19880617! I recently updated to Java 11 for another project. I will change back
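If it helps anyone hitting the same thing: on a macOS setup like the one in the log above, switching the active JDK for a single shell is typically enough (this assumes a JDK 8 is installed):
```
# point JAVA_HOME at an installed JDK 8 for this shell session only
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
java -version   # verify before re-running 'mvn test'
```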
  @fx19880617: sure, I’m thinking of upgrading the project JDK from 8 to 11
  @fx19880617: but need some time :stuck_out_tongue:
  @qiaochu: gotcha, thanks for the information!
@asif: @asif has joined the channel
@amherman: @amherman has joined the channel

#pinot-dev


@ravi.maddi: @ravi.maddi has joined the channel
@ravi.maddi: *Hi All,* I am getting an error while starting zookeeper with pinot admin:
```
zookeeper state changed (SyncConnected)
Waiting for keeper state SyncConnected
Terminate ZkClient event thread.
Session: 0x10003506d770000 closed
Start zookeeper at localhost:2181 in thread main
EventThread shut down for session: 0x10003506d770000
Expiring session 0x10002b33f080005, timeout of 30000ms exceeded
Expiring session 0x10002b33f080006, timeout of 30000ms exceeded
Expiring session 0x10002b33f080007, timeout of 30000ms exceeded
Expiring session 0x10002b33f080004, timeout of 30000ms exceeded
Expiring session 0x10002b33f080008, timeout of 30000ms exceeded
Expiring session 0x10002b33f080002, timeout of 30000ms exceeded
Expiring session 0x10002b33f08000b, timeout of 60000ms exceeded
```
I have been facing this issue since yesterday morning, and because zookeeper is not ready, the other components are also not working properly. *Need Help* :slightly_smiling_face:
  @fx19880617: how do you start zookeeper?
  @ravi.maddi: ```cd /home/ubuntu/pinot
bin/pinot-admin.sh StartZookeeper -zkPort 2181```
  @fx19880617: it’s on your local linux?
  @fx19880617: which java version?
  @fx19880617: can you also check if you can start zk on a different port?
  @ravi.maddi: openjdk version "1.8.0_282"
  @ravi.maddi: I checked with the ps command and found the process running
  @fx19880617: can you kill it then rerun ?
  @fx19880617: it could be the port is occupied
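A quick way to check that, assuming standard tooling on the box, is to look up what owns the port before restarting:
```
# show any process already bound to the ZooKeeper port
lsof -i :2181
# on Linux, alternatively:
netstat -tlnp | grep 2181
```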
  @ravi.maddi: I restarted the server and tried; same issue again and again.
  @ravi.maddi: still no luck
  @ravi.maddi: I started quick-start-batch instead of quick-start-streaming, how do I stop quick-start-batch, any idea?
  @ravi.maddi: stop cluster
  @fx19880617: kill the process
  @fx19880617: ctrl+c
  @fx19880617: you can run ```JAVA_OPTS="-Dlog4j2.configurationFile=conf/pinot-ingestion-job-log4j2.xml" bin/pinot-admin.sh StartZookeeper -zkPort 2181```
  @fx19880617: and check the detailed error log
  @ravi.maddi: sure, I will check now
  @ravi.maddi: ctrl+c -- is that how to stop the cluster?
  @fx19880617: yes
  @ravi.maddi: thanks, stopped
@fx19880617: wanna start a session on this topic:
@fx19880617: upgrade JDK from 8 to 11(?)
  @dlavoie: I guess 11 is the only realistic target, since the next LTS after 11 will be 17
  @g.kishore: does this require a code change, and will it make it incompatible with JDK 8?
  @dlavoie: Yes and yes, lots of removed modules from Java 11.
  @fx19880617: Yes, it requires code changes
  @fx19880617: I’m on this PR() to make JDK 11 pass the tests
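For a sense of what "code changes" means here (illustrative only, not the contents of that PR): moving a Maven build to JDK 11 typically involves bumping the compiler target and re-adding the Java EE modules that JDK 11 removed, e.g.:
```
<!-- compile for Java 11 -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <release>11</release>
  </configuration>
</plugin>

<!-- javax.xml.bind was dropped from the JDK in 11; bring it back explicitly -->
<dependency>
  <groupId>jakarta.xml.bind</groupId>
  <artifactId>jakarta.xml.bind-api</artifactId>
  <version>2.3.3</version>
</dependency>
```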

#community


@ali: @ali has joined the channel
@slatermegank: @slatermegank has joined the channel

#announcements


@ali: @ali has joined the channel

#getting-started


@slatermegank: @slatermegank has joined the channel

#segment-write-api


@yupeng: updated the doc. PTAL
@chinmay.cerebro: will do