#general


@momento.corto: @momento.corto has joined the channel
@momento.corto: Hi, do you know if it’s possible to query Pinot from Apache Drill or Dremio?
  @mayanks: Hello, not currently. On that front, Pinot has connectors for Presto and Trino.
  @momento.corto: Thank you
@xiaoman: @xiaoman has joined the channel

#random


@momento.corto: @momento.corto has joined the channel
@xiaoman: @xiaoman has joined the channel

#troubleshooting


@lrhadoop143: Hi Team, I'm trying to set up Pinot in Docker and load a table. I'm facing issues while loading data into the table. ERROR: java.lang.RuntimeException: Failed to read from Schema URI - ''. Can you please help me fix this issue? I'm using this yml file: executionFrameworkSpec: name: 'standalone' segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner' segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner' segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner' jobType: SegmentCreationAndTarPush inputDirURI: '/tmp/pinot-quick-start/rawdata/' includeFileNamePattern: 'glob:**/*.csv' outputDirURI: '/tmp/pinot-quick-start/segments/' overwriteOutput: true pinotFSSpecs: - scheme: file className: org.apache.pinot.spi.filesystem.LocalPinotFS recordReaderSpec: dataFormat: 'csv' className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader' configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig' tableSpec: tableName: 'transcript' schemaURI: '' tableConfigURI: '' pinotClusterSpecs: - controllerURI: ''
  @mark.needham: Can you see any more information on the error message in the logs?
  @lrhadoop143: at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:161) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-6b33448da58992773ee23b863da029650e9ec37f] at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:192) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-6b33448da58992773ee23b863da029650e9ec37f] Caused by: java.net.ConnectException: Connection refused (Connection refused) at .PlainSocketImpl.socketConnect(Native Method) ~[?:?] at .AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399) ~[?:?]
  @mark.needham: So does that mean that you also can't navigate to ?
  @lrhadoop143: I can navigate to it and open the Pinot UI,
  @lrhadoop143: but I'm not able to load data into the table, even though I created the table and schema.
  @mark.needham: hmmmm ok, I'm not sure why you'd get a connection refused exception in that case
  @mark.needham: Can you try this spec:
```
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: '/tmp/pinot-quick-start/rawdata/'
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: '/tmp/pinot-quick-start/segments/'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
tableSpec:
  tableName: 'transcript'
pinotClusterSpecs:
  - controllerURI: ''
```
  @mark.needham: also which command did you run to ingest the data?
  @mayanks: @lrhadoop143 If you are unable to access the Pinot console, then most likely the controller is not even running?
  @mayanks: Can you confirm if Pinot is up and running?
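A `Connection refused` error means a TCP connection could not be opened from wherever the ingestion job runs. Being able to open the UI in a browser does not by itself show that the job container can reach the controller at the address given in the spec. The following is a minimal sketch, not part of the original thread, of a reachability check that could be run from the same network as the job. The host name `pinot-controller` and port `9000` in the usage section are assumptions for illustration; substitute whatever your `controllerURI` points to.

```python
import socket


def controller_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Connection refused, timeouts, and DNS failures are all subclasses of
    OSError, so they are all reported as "not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed values for illustration only; use the host and port from
    # the controllerURI in your own job spec.
    host, port = "pinot-controller", 9000
    status = "reachable" if controller_reachable(host, port) else "NOT reachable"
    print(f"{host}:{port} is {status}")
```

If the check fails from inside the job's Docker network but the UI opens from the host machine, the address in the spec is a likely suspect, for example a `localhost` address that resolves to the job container itself.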
@lrhadoop143: docker run --rm -ti --network=pinot-demo -v /tmp/pinot-quick-start:/tmp/pinot-quick-start --name pinot-data-ingestion-job apachepinot/pinot:latest LaunchDataIngestionJob -jobSpecFile /tmp/pinot-quick-start/docker-job-spec.yml
  @mark.needham: looks good to me
  @lrhadoop143: Ok
  @mark.needham: can you try the spec I posted on the other thread?
  @lrhadoop143: Not yet. I will try it and update you. Thanks for replying.
@bagi.priyank: I am running into this stack trace in the log within a second of adding a real-time table:
```
021/11/20 00:18:41.296 ERROR [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] Exception while executing a state transition task km_mp_play_startree__103__0__20211120T0018Z
java.lang.reflect.InvocationTargetException: null
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
  at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
  at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331) [pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
  at java.nio.Bits.reserveMemory(Bits.java:175) ~[?:?]
  at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118) ~[?:?]
  at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317) ~[?:?]
  at org.apache.pinot.segment.spi.memory.PinotByteBuffer.allocateDirect(PinotByteBuffer.java:38) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.segment.spi.memory.PinotDataBuffer.allocateDirect(PinotDataBuffer.java:115) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.segment.local.io.writer.impl.DirectMemoryManager.allocateInternal(DirectMemoryManager.java:53) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.segment.local.io.readerwriter.RealtimeIndexOffHeapMemoryManager.allocate(RealtimeIndexOffHeapMemoryManager.java:80) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.segment.local.realtime.impl.forward.FixedByteSVMutableForwardIndex.addBuffer(FixedByteSVMutableForwardIndex.java:208) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.segment.local.realtime.impl.forward.FixedByteSVMutableForwardIndex.<init>(FixedByteSVMutableForwardIndex.java:77) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl.<init>(MutableSegmentImpl.java:308) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.<init>(LLRealtimeSegmentDataManager.java:1364) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:344) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addRealtimeSegment(HelixInstanceDataManager.java:162) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:164) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeConsumingFromOffline(SegmentOnlineOfflineStateModelFactory.java:86) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
  ... 12 more
```
I have tried increasing the heap size (right now at 16G) and I am still running into this issue. I am using 5 servers to consume from a topic with 128 partitions, with an event rate of about 7M events per minute. I see 26 segments on 3 servers and 25 on 2 servers in Bad state.
  @mayanks: It is running out of direct memory. What JVM configs are you using, and how much memory is available on the instance running the Pinot server?
  @bagi.priyank: -Xms4G -Xmx16G -Dpinot.admin.system.exit=false
  @mayanks: How much memory is available on the instance, @bagi.priyank?
  @bagi.priyank: 30.5 G. Sorry, I was grabbing the info.
  @npawar: i wonder if this is because the offheap.alloc is not set on the servers @mayanks?
  @bagi.priyank: i am using pinot version 0.9.0 on java 11
  @npawar: Not sure whether it defaults to false or true.
  @mayanks: Are all 128 partitions on a single server @bagi.priyank?
  @bagi.priyank: 5 servers
  @bagi.priyank:
```
"realtime.segment.flush.threshold.rows": "300000000",
"realtime.segment.flush.threshold.time": "24h",
"realtime.segment.flush.desired.size": "300M",
```
  @mayanks: @npawar yes we should switch to off heap if not already there.
  @mayanks: `realtime.segment.flush.threshold.rows` is too high
  @mayanks: Set it to 0, so desired size kicks in
  @bagi.priyank: i see, ok
  @bagi.priyank: Do you think I should try setting `-XX:MaxDirectMemorySize`?
  @npawar: i recommend setting this on servers instead
  @bagi.priyank: ok. thanks guys!
  @bagi.priyank: I am not seeing the OOM anymore. However, now there are ~200 segments per server in an hour. Is that a decent / high / low number?
  @npawar: That number will keep going down. When you enable segment-size-based thresholds, the first segment will have 100k rows, and then the number of rows per segment will keep growing until segments reach the 300M desired size.
  @bagi.priyank: got it. i do see 100k rows right now.
  @npawar: If you want, we can increase the initial number from 100k to something more
  @bagi.priyank: I am still trying to understand the impact of it. More segments means more segments to scan for a query, right? So I'm trying to understand whether what I have is a decent place to start or whether there is anything else I can optimize.
  @npawar: 200 per server per hour is certainly high. Might be worth setting the `realtime.segment.flush.autotune.initialRows` to 1000000 (or something proportional to make your segment size bigger).
  @bagi.priyank: got it. what is the sweet spot in your experience?
  @npawar: A segment size of 300-500M is a good target. So just make it (300 * 100k) / currentlySeenSegmentSize
  @npawar: maybe a little less than that
  @npawar: Anyway, this is all simply to avoid the initial surge of segments. In 2-3 iterations, the segment size should stabilise on its own.
  @bagi.priyank: got it. thank you once again!
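The rule of thumb above, (300 * 100k) / currentlySeenSegmentSize, can be worked through as a small calculation. This is only a sketch of that arithmetic, not a Pinot API: the function name, its parameters, and the 30 MB observed size are hypothetical, and sizes are assumed to be in MB. The optional `headroom` factor reflects the "a little less than that" advice.

```python
def suggest_initial_rows(
    target_size_mb: float,
    current_rows: int,
    observed_size_mb: float,
    headroom: float = 1.0,
) -> int:
    """Scale the current row count so a segment lands near the target size.

    Assumes segment size grows roughly linearly with the number of rows,
    which is the assumption behind the rule of thumb in the thread.
    """
    if observed_size_mb <= 0:
        raise ValueError("observed_size_mb must be positive")
    return int(target_size_mb * current_rows * headroom / observed_size_mb)


if __name__ == "__main__":
    # Hypothetical observation: the first 100k-row segments come out at 30 MB.
    rows = suggest_initial_rows(300, 100_000, 30)
    print(rows)  # 1000000
    # Slightly lower value, following the "a little less than that" advice.
    print(suggest_initial_rows(300, 100_000, 30, headroom=0.9))
```

With those assumed numbers the result matches the 1000000 suggested for `realtime.segment.flush.autotune.initialRows`. In practice, plug in the segment size you actually observe.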
@momento.corto: @momento.corto has joined the channel
@xiaoman: @xiaoman has joined the channel