#general


@sebastian.schulz: @sebastian.schulz has joined the channel
@xingxingyang: @xingxingyang has joined the channel
@linhan: @linhan has joined the channel
@changtongsu: @changtongsu has joined the channel
@piyush.chauhan: Can we use Java Spring Boot Data JPA with Pinot? I am not able to find any resources on it.
  @piyush.chauhan: With this I want to query the Pinot DB without writing manual SQL queries.
  @mayanks: Not at the moment, but I can see value in having some sort of client-side model/SQL builder; do you mind opening a GH issue?
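Until such a builder exists, the usual workaround is to hand-write the SQL and issue it through the `pinot-java-client`. A minimal sketch; the broker address and table name are illustrative, not from this thread:
```java
import org.apache.pinot.client.Connection;
import org.apache.pinot.client.ConnectionFactory;
import org.apache.pinot.client.ResultSetGroup;

public class PinotQueryExample {
  public static void main(String[] args) {
    // Connect straight to a broker; replace host:port with your own.
    Connection connection = ConnectionFactory.fromHostList("localhost:8099");
    try {
      // No JPA-style repository layer exists yet, so the SQL is written by hand.
      ResultSetGroup result = connection.execute("SELECT * FROM myTable LIMIT 10");
      System.out.println(result.getResultSet(0));
    } finally {
      connection.close();
    }
  }
}
```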
@shyam: @shyam has joined the channel
@prashant.korade: @prashant.korade has joined the channel

#random


@sebastian.schulz: @sebastian.schulz has joined the channel
@xingxingyang: @xingxingyang has joined the channel
@linhan: @linhan has joined the channel
@changtongsu: @changtongsu has joined the channel
@shyam: @shyam has joined the channel
@prashant.korade: @prashant.korade has joined the channel

#troubleshooting


@sebastian.schulz: @sebastian.schulz has joined the channel
@xingxingyang: @xingxingyang has joined the channel
@linhan: @linhan has joined the channel
@changtongsu: @changtongsu has joined the channel
@lrhadoop143: Hi Team, I'm trying to connect to MinIO through Apache Pinot. When I run the ingestion job YAML it fails with this error: `ERROR [LaunchDataIngestionJobCommand] [main] Got exception to kick off standalone data ingestion job - java.lang.RuntimeException: software.amazon.awssdk.core.exception.SdkClientException: Configured region (localhost%3A9010) resulted in an invalid URI: Valid region examples:` I started the controller and server with controller.conf and server.conf, and my YAML file is:
```
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndUriPush
inputDirURI: ''
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: ''
overwriteOutput: true
pinotFSSpecs:
  - scheme: http
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: 'localhost:9010'
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
tableSpec:
  tableName: 'transcript'
  schemaURI: ''
  tableConfigURI: ''
pinotClusterSpecs:
  - controllerURI: ''
```
  @mark.needham: are you wanting to store stuff in S3? If so then this is an invalid AWS region: ```region: 'localhost:9010'```
  @lrhadoop143: actually I'm using MinIO, not S3.
  @mark.needham: got it. So I think we need to pass in `endpoint`. Can you try this:
```
pinotFSSpecs:
  - scheme: http
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: 'us-east-1'
      endpoint: ''
```
  @lrhadoop143: Now it's throwing a connection refused error: `ERROR [LaunchDataIngestionJobCommand] [main] Got exception to kick off standalone data ingestion job - software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Connect to localhost:9010 [localhost/127.0.0.1] failed: Connection refused (Connection refused)`
  @mark.needham: is it the right endpoint? I was trying to guess what it should be from following this tutorial -
  @lrhadoop143: Hi Mark, I'm using a MinIO public account to access data from MinIO, and I'm facing this issue: `ERROR [LaunchDataIngestionJobCommand] [main] Got exception to kick off standalone data ingestion job - software.amazon.awssdk.services.s3.model.S3Exception: The AWS Access Key Id you provided does not exist in our records. (Service: S3, Status Code: 403, Request ID: M4NGSS6PPRBP2DJ6, Extended Request ID: /0wE5yHVkh0LtoL+Sy6SetfcRuHSUdt4ZRATk3k6vyzuSaAFoLf3Szb9xOrkXvi44e1cKNdGIjA=)`
  @mark.needham: are you passing in valid `accessKey` and `secretKey` under the configs section?
  @lrhadoop143: Yeah Mark, I passed that too and now it is working, but at the end I'm facing this issue: `java.lang.RuntimeException: Failed to read from Schema URI - ''`
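For reference, the keys can also live in the job spec itself rather than on the command line. A minimal sketch, assuming the standard `S3PinotFS` config names; the endpoint and key values here are placeholders:
```
pinotFSSpecs:
  - scheme: s3
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: 'us-east-1'
      endpoint: 'http://localhost:9010'  # illustrative MinIO endpoint
      accessKey: 'xxx'                   # placeholder - don't commit real keys
      secretKey: 'xxx'
```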
  @mark.needham: ah cool ok, making some progress
  @mark.needham: do you have `transcript_minio` somewhere? I can't see it...
  @lrhadoop143: I created it in Pinot
  @lrhadoop143: I changed the table name and schema name; now we are using `transcript_minio` only
  @mark.needham: oh I see
  @mark.needham: can you paste the ingestion job yaml again (xxx out your AWS credentials though)
  @lrhadoop143: I passed the credentials through the command
  @lrhadoop143:
```
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndUriPush
inputDirURI: ''
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: ''
overwriteOutput: true
pinotFSSpecs:
  - scheme: s3
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: 'us-east-1'
      endpoint: ''
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
tableSpec:
  tableName: 'transcript_minio'
  schemaURI: ''
  tableConfigURI: ''
pinotClusterSpecs:
  - controllerURI: ''
```
  @mark.needham: if you go to do you see anything?
  @lrhadoop143: Yeah I can see schema
  @mark.needham: hmmm, I wonder why the ingestion job can't access it then
  @mark.needham: Try this one:
```
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndUriPush
inputDirURI: ''
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: ''
overwriteOutput: true
pinotFSSpecs:
  - scheme: s3
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: 'us-east-1'
      endpoint: ''
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
tableSpec:
  tableName: 'transcript_minio'
pinotClusterSpecs:
  - controllerURI: ''
```
  @mark.needham: in theory that's exactly the same, as it will infer the schema/table config URIs
  @lrhadoop143: ok let me try
  @lrhadoop143: no use Mark, same error.
  @mark.needham: can you paste the whole stack trace so I can see which line it's failing on
  @lrhadoop143:
```
sudo docker run --rm -ti \
  --name pinot-ingestion-job \
  -v //tmp/pinot-quick-start:/tmp/pinot-quick-start \
  --network=pinot-demo \
  --env AWS_ACCESS_KEY_ID="Q3AM3UQ867SPQQA43P2F" \
  --env AWS_SECRET_ACCESS_KEY="zuf+tfteSlswRu7BJ86wekitnifILbZam1KYY3TG" \
  --mount type=bind,source=/tmp/pinot-s3-docker,target=/tmp \
  apachepinot/pinot:latest LaunchDataIngestionJob \
  -jobSpecFile /tmp/pinot-quick-start/transcript-s3.yml

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/pinot/lib/pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.10.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-environment/pinot-azure/pinot-azure-0.10.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-metrics/pinot-yammer/pinot-yammer-0.10.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-metrics/pinot-dropwizard/pinot-dropwizard-0.10.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-file-system/pinot-s3/pinot-s3-0.10.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.codehaus.groovy.reflection.CachedClass (file:/opt/pinot/lib/pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar) to method java.lang.Object.finalize()
WARNING: Please consider reporting this to the maintainers of org.codehaus.groovy.reflection.CachedClass
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2021/12/01 10:23:46.227 ERROR [LaunchDataIngestionJobCommand] [main] Got exception to kick off standalone data ingestion job -
java.lang.RuntimeException: Failed to read from Schema URI - ''
    at org.apache.pinot.common.segment.generation.SegmentGenerationUtils.getSchema(SegmentGenerationUtils.java:89)
    at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.init(SegmentGenerationJobRunner.java:144)
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:144)
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:121)
    at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.execute(LaunchDataIngestionJobCommand.java:120)
    at org.apache.pinot.tools.Command.call(Command.java:33)
    at org.apache.pinot.tools.Command.call(Command.java:29)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:161)
    at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:192)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
    at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
    at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
    at java.base/java.net.Socket.connect(Socket.java:609)
    at java.base/java.net.Socket.connect(Socket.java:558)
    at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:182)
    at java.base/sun.net.(HttpClient.java:474)
    at java.base/sun.net.(HttpClient.java:569)
    at java.base/sun.net..<init>(HttpClient.java:242)
    at java.base/sun.net.(HttpClient.java:341)
    at java.base/sun.net.(HttpClient.java:362)
    at java.base/sun.net.(HttpURLConnection.java:1253)
    at java.base/sun.net.(HttpURLConnection.java:1187)
    at java.base/sun.net.(HttpURLConnection.java:1081)
    at java.base/sun.net.(HttpURLConnection.java:1015)
    at java.base/sun.net.(HttpURLConnection.java:1592)
    at java.base/sun.net.(HttpURLConnection.java:1520)
    at org.apache.pinot.common.segment.generation.SegmentGenerationUtils.fetchUrl(SegmentGenerationUtils.java:233)
    at org.apache.pinot.common.segment.generation.SegmentGenerationUtils.getSchema(SegmentGenerationUtils.java:87)
    ... 15 more
```
  @lrhadoop143: Caused by: java.io.IOException: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Connect to [] failed: connect timed out
  @mark.needham: so is the problem reading from MinIO, or is it reading the schema?
  @lrhadoop143: Yeah
  @mark.needham: which one though?
  @lrhadoop143: Reading from minio
  @mark.needham: oh I see - is minio not accessible from inside that Docker container or something?
  @lrhadoop143: It's giving timeout exception
  @mark.needham: @lrhadoop143 I made a little recipe on a sample dataset and I'm able to have the `tar.gz` file written to MinIO. I'm not entirely sure what I have different to you though -
@syedakram93: hi, is there any way to set a query timeout parameter from the JDBC/Java client, if a query is taking more than 10 sec?
  @alihaydar.atil: @syedakram93 you can set the query timeout in your table configuration file with the `timeoutMs` property.
  @syedakram93: I'm using the Java client to query too. Is it possible to set it there as well?
  @npawar: you can add it to the end of your query like this
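For the Java client specifically, a minimal sketch, assuming the `OPTION(timeoutMs=...)` query suffix that Pinot accepted in this era (broker address and table name are illustrative):
```java
import org.apache.pinot.client.Connection;
import org.apache.pinot.client.ConnectionFactory;
import org.apache.pinot.client.ResultSetGroup;

public class QueryTimeoutExample {
  public static void main(String[] args) {
    Connection connection = ConnectionFactory.fromHostList("localhost:8099");
    try {
      // The timeoutMs option is appended to the SQL itself and overrides
      // the broker/table default for this single query.
      ResultSetGroup result = connection.execute(
          "SELECT COUNT(*) FROM myTable OPTION(timeoutMs=10000)");
      System.out.println(result.getResultSet(0));
    } finally {
      connection.close();
    }
  }
}
```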
@shyam: @shyam has joined the channel
@prashant.korade: @prashant.korade has joined the channel
@mapshen: When running the realtimeProvisioningHelper, we got a bunch of NAs. Any idea on how to troubleshoot this?
```
RealtimeProvisioningHelper -tableConfigFile <tableConfig> -numPartitions 1 -pushFrequency null \
  -numHosts 1,2,3,4 -numHours 1,2,3,4,56,12,18,24 -sampleCompletedSegmentDir <path-to-segment> \
  -ingestionRate 1000 -maxUsableHostMemory 5120G -retentionHours 1

Note:
* Table retention and push frequency ignored for determining retentionHours since it is specified in command
* See

Memory used per host (Active/Mapped)
numHosts --> 1             |2             |3             |4             |
numHours
 1 --------> 8.1G/295.67G  |4.05G/147.83G |4.05G/147.83G |4.05G/147.83G |
 2 --------> NA            |NA            |NA            |NA            |
 3 --------> NA            |NA            |NA            |NA            |
 4 --------> NA            |NA            |NA            |NA            |
12 --------> NA            |NA            |NA            |NA            |
18 --------> NA            |NA            |NA            |NA            |
24 --------> NA            |NA            |NA            |NA            |
56 --------> NA            |NA            |NA            |NA            |

Optimal segment size
numHosts --> 1     |2     |3     |4     |
numHours
 1 --------> 1.51G |1.51G |1.51G |1.51G |
 2 --------> NA    |NA    |NA    |NA    |
 3 --------> NA    |NA    |NA    |NA    |
 4 --------> NA    |NA    |NA    |NA    |
12 --------> NA    |NA    |NA    |NA    |
18 --------> NA    |NA    |NA    |NA    |
24 --------> NA    |NA    |NA    |NA    |
56 --------> NA    |NA    |NA    |NA    |

Consuming memory
numHosts --> 1     |2     |3     |4     |
numHours
 1 --------> 8.1G  |4.05G |4.05G |4.05G |
 2 --------> NA    |NA    |NA    |NA    |
 3 --------> NA    |NA    |NA    |NA    |
 4 --------> NA    |NA    |NA    |NA    |
12 --------> NA    |NA    |NA    |NA    |
18 --------> NA    |NA    |NA    |NA    |
24 --------> NA    |NA    |NA    |NA    |
56 --------> NA    |NA    |NA    |NA    |

Total number of segments queried per host (for all partitions)
numHosts --> 1     |2     |3     |4     |
numHours
 1 --------> 2     |1     |1     |1     |
 2 --------> NA    |NA    |NA    |NA    |
 3 --------> NA    |NA    |NA    |NA    |
 4 --------> NA    |NA    |NA    |NA    |
12 --------> NA    |NA    |NA    |NA    |
18 --------> NA    |NA    |NA    |NA    |
24 --------> NA    |NA    |NA    |NA    |
56 --------> NA    |NA    |NA    |NA    |
```

#pinot-s3


@derek.p.moore: @derek.p.moore has joined the channel

#pinot-k8s-operator


@derek.p.moore: @derek.p.moore has joined the channel

#pinot-dev


@sebastian.schulz: @sebastian.schulz has joined the channel

#community


@derek.p.moore: @derek.p.moore has joined the channel

#presto-pinot-connector


@derek.p.moore: @derek.p.moore has joined the channel

#getting-started


@shyam: @shyam has joined the channel
@luisfernandez: has anyone tried moving the segments folder to BigQuery if you use Google Cloud Storage? To run big data computations that may not be possible in Pinot?

#releases


@sebastian.schulz: @sebastian.schulz has joined the channel

#flink-pinot-connector


@derek.p.moore: @derek.p.moore has joined the channel