#general


@jorgarcia1994: @jorgarcia1994 has joined the channel
@g.kishore: "Intro to Pinot" session by @chinmay.cerebro - starting now.
  @wrbriggs: I have to leave early, but I’m looking forward to seeing the rest of the presentation via the recording - thank you @chinmay.cerebro for providing all this information to the community, and @karinwolok1 for organizing this!

#random


@jorgarcia1994: @jorgarcia1994 has joined the channel
@asmamnoor96: hello guys, I am new to Pinot. Any suggestions on where to start?
  @fx19880617: hi, welcome! You can start from the Pinot documentation for an intro:
  @fx19880617: then try getting started with Pinot in your own environment:
  @asmamnoor96: okay.. thanx :slightly_smiling_face:
@apadhy: Hey guys, I am planning to do a POC on Pinot for real-time analytics over Kafka. I wanted to understand: does it support joins across multiple Kafka topics, and how efficient is it compared to Flink at this time?
  @wrbriggs: Pinot by itself does not handle joining Kafka streams at ingestion time. It does support defining a dimension table and using a `lookup` UDF to do star-schema style dimension lookups. It cannot do standard SQL-style `JOIN` operations between multiple tables, but PrestoDB can use Pinot as a back-end to accomplish full ANSI-SQL joins:
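For illustration, a minimal sketch of the `lookup` UDF mentioned above (the table and column names here are hypothetical, and the dimension table is assumed to be configured as a dimension table with a primary key):
```
-- Enrich a hypothetical fact table `baseballStats` from a hypothetical
-- dimension table `dimBaseballTeams`, joining on the teamID key
SELECT playerName,
       teamID,
       LOOKUP('dimBaseballTeams', 'teamName', 'teamID', teamID) AS teamName
FROM baseballStats
```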
  @wrbriggs: Also, you probably will be better off asking questions like this in <#CDRCA57FC|general>, as this <#CDRJ5UE21|random> channel is intended more for non-technical discussion.
  @wrbriggs: if your use case requires Stream -> Stream joins on unbounded inputs (e.g., two infinite Kafka streams), IMO, you would be better off doing that work in Spark, Flink, or Kafka Streams, where you have better control over the time window for data retention as well as the logic for the join and any subsequent transformations or projections of the join result, and then ingesting the output into Pinot for query-time analysis. I’m not affiliated with the Pinot project, so take my opinion with a whole handful of salt - people much smarter than I am might have some better solutions for you :slightly_smiling_face:
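To make the Stream -> Stream option above concrete, here is a minimal Kafka Streams sketch (the topic names, serdes, and join window are hypothetical; the joined output topic would then be ingested by a Pinot realtime table):
```java
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;

public class ClickImpressionJoin {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "click-impression-join");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

    StreamsBuilder builder = new StreamsBuilder();
    // Two hypothetical unbounded input streams, keyed by the same join key
    KStream<String, String> impressions = builder.stream("impressions");
    KStream<String, String> clicks = builder.stream("clicks");

    // Windowed stream-stream join: only records with the same key arriving
    // within 5 minutes of each other are paired, which bounds the join state
    // retained by the application.
    impressions
        .join(clicks,
              (impression, click) -> impression + "," + click,
              JoinWindows.of(Duration.ofMinutes(5)))
        .to("joined-events"); // Pinot would ingest this topic as a realtime table

    new KafkaStreams(builder.build(), props).start();
  }
}
```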
  @g.kishore: @wrbriggs is right!
  @ssubrama: @apadhy another possible solution is to use samza before ingesting to pinot
  @wrbriggs: Sorry @ssubrama! I like Samza, I swear! I've never used it in a production use case so I always forget to suggest it :(
  @ssubrama: no issues. At LinkedIn, many use cases use Samza to process data first before ingesting into Pinot

#troubleshooting


@jorgarcia1994: @jorgarcia1994 has joined the channel
@suraj: Our brokers have been running into direct memory allocation OOM errors. We have allocated 128M. Noticed that the brokers don't crash but catch the exception and log it. The only symptom we see is query timeouts. Would like to understand: *a) what is the direct memory used for? b) any guidelines to size it?*
  @g.kishore: it's used by Netty; 128M is too little if you are moving a lot of data between server and broker
  @g.kishore: increase it to 1G
  @suraj: thanks
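For reference, the broker's direct memory cap is the standard JVM `-XX:MaxDirectMemorySize` flag. A hedged example, assuming the stock `pinot-admin.sh` launcher (which picks up `JAVA_OPTS`) and otherwise hypothetical heap sizes:
```
# Example only: raise the broker direct-memory cap to 1G as suggested above
export JAVA_OPTS="-Xms4G -Xmx4G -XX:MaxDirectMemorySize=1G"
bin/pinot-admin.sh StartBroker -zkAddress localhost:2181
```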
@suraj: ```
2021/01/22 00:15:36.706 ERROR [DataTableHandler] [nioEventLoopGroup-2-3] Caught exception while handling response from server: pinot-server-3_R
java.lang.OutOfMemoryError: Direct buffer memory
    at java.nio.Bits.reserveMemory(Bits.java:175) ~[?:?]
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118) ~[?:?]
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317) ~[?:?]
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:758) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.buffer.PoolArena$DirectArena.newUnpooledChunk(PoolArena.java:748) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.buffer.PoolArena.allocateHuge(PoolArena.java:260) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:232) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.buffer.PoolArena.reallocate(PoolArena.java:397) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:119) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.buffer.AbstractByteBuf.ensureWritable0(AbstractByteBuf.java:310) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:281) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1118) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1111) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1102) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:96) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:281) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at java.lang.Thread.run(Thread.java:834) [?:?]
```

#pinot-dev


@amrish.k.lal: I get this error while running Quickstart.java (straight out of the box). Has anything changed here?
```
TotalProcessed time for event: MessageChange took: 18 ms
Exception in thread "main" java.lang.RuntimeException: Failed to create IngestionJobRunner instance for class - org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:137)
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:113)
    at org.apache.pinot.tools.BootstrapTableTool.bootstrapOfflineTable(BootstrapTableTool.java:189)
    at org.apache.pinot.tools.BootstrapTableTool.execute(BootstrapTableTool.java:99)
    at org.apache.pinot.tools.admin.command.QuickstartRunner.bootstrapTable(QuickstartRunner.java:207)
    at org.apache.pinot.tools.Quickstart.execute(Quickstart.java:180)
    at org.apache.pinot.tools.Quickstart.main(Quickstart.java:223)
Caused by: java.lang.ClassNotFoundException: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at org.apache.pinot.spi.plugin.PluginClassLoader.loadClass(PluginClassLoader.java:80)
    at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:293)
    at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:264)
    at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:245)
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:135)
    ... 6 more
```
@amrish.k.lal: @fx19880617 I am wondering if this ^^ is related to ?
@fx19880617: shouldn’t be, where are you running this quickstart?
@amrish.k.lal: In IntelliJ, right click and run
@fx19880617: are you running it in the pinot-distribution directory?
@fx19880617: hmm
@fx19880617: are you on master branch?
@amrish.k.lal: yes and without any changes.
@fx19880617: hmm
@fx19880617: have you tried to build Pinot using Maven?
@fx19880617: I feel we may need to make pinot-batch-ingestion-standalone a runtime dependency so that first-time users can run it through the IDE
  @amrish.k.lal: I find Quickstart useful for debugging; that's why I was running it from IntelliJ.
@amrish.k.lal: Yes, I am able to build successfully through `mvn clean install package -DskipTests -Pbin-dist -DdownloadSources -DdownloadJavadocs`
@fx19880617: ok
@fx19880617: can you try one thing: add
```
<dependency>
  <groupId>org.apache.pinot</groupId>
  <artifactId>pinot-batch-ingestion-standalone</artifactId>
  <version>${project.version}</version>
  <scope>runtime</scope>
</dependency>
```
into pinot-tools/pom.xml
  @amrish.k.lal: ok
  @amrish.k.lal: I still get the same error after selecting Quickstart.java and right-clicking to run/debug Quickstart.main()
  @amrish.k.lal: Is there any other way to run Quickstart under debugger?
  @fx19880617: can you try removing the <scope>runtime</scope>
  @fx19880617: and refresh the module
  @fx19880617: when my IntelliJ builds this module
  @fx19880617: it copies all the resources
  @fx19880617: have you tried to build the Pinot module?
  @amrish.k.lal: Trying it...
  @amrish.k.lal: Ahh, works now :slightly_smiling_face:
  @amrish.k.lal: I removed <scope>runtime</scope> and reloaded the pinot-tools project. works fine after that :slightly_smiling_face:
  @amrish.k.lal: Thanks.
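For reference, the dependency block that ended up working in this thread, with the runtime scope removed as discussed above:
```
<dependency>
  <groupId>org.apache.pinot</groupId>
  <artifactId>pinot-batch-ingestion-standalone</artifactId>
  <version>${project.version}</version>
</dependency>
```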

#announcements


@jorgarcia1994: @jorgarcia1994 has joined the channel
@g.kishore: "Intro to Pinot" session by @chinmay.cerebro - starting now.
@chinmay.cerebro: @chinmay.cerebro has joined the channel

#pinot-docs


@ken: I was looking into the `cleanUpOutputDir` flag - it doesn’t seem to be documented yet. Seems like this really means `deleteOutputSegmentAfterPush`, yes?
@ken: I was also looking into the `overwriteOutput` flag, which also doesn’t seem to be documented. It seems to have different meanings in the code…in the Hadoop & Spark batch ingest code, it’s checking whether the staging dir/output/ directory already contains the segment tar file before it’s copied from local, which seems odd since this directory is created at the start of the job, and deleted at the end, so there shouldn’t be collisions. In standalone batch ingest it’s checking whether the actual output file exists, which seems correct. And in the `SegmentGenerationAndPushTaskGenerator` it’s documented as “overwriteOutput - Optional, delete the output segment directory if set to true”. I’m guessing the Hadoop/Spark code is a bug, and the flag should be getting checked when the staging/segmentTar/ files are being copied to the output dir, and the documentation in `SegmentGenerationAndPushTaskGenerator` is wrong.
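For context, both flags discussed above are settings in the batch ingestion job spec YAML. A minimal, hypothetical sketch (paths are made up, and the field placement and comments reflect my reading of `SegmentGenerationJobSpec`, so treat them as assumptions rather than documented behavior):
```
jobType: SegmentCreationAndTarPush
inputDirURI: 'file:///path/to/rawdata'
outputDirURI: 'file:///path/to/segments'
overwriteOutput: true    # assumption: overwrite segment files already present in outputDirURI
cleanUpOutputDir: true   # assumption: delete generated segment files after a successful push
```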

#getting-started


@jorgarcia1994: @jorgarcia1994 has joined the channel