#general
@khushbu.agarwal: @khushbu.agarwal has joined the channel
@sagar.khangan: @sagar.khangan has joined the channel
@sagar.khangan: Hi, I have set up Pinot on EKS. How can I start with a batch upload from S3? How can I set configs and create tables? I have started the incubator.
@fx19880617: you can check this doc for the corresponding configs to set in the `values.yml` file:
@fx19880617:
@bowlesns: I’m fairly new to Pinot, but I just did something similar in GCP so this should be the general direction: 1. Add s3 plugin and configuration from here:
@bowlesns: If anyone has an easier way to do 5 I’m all ears, but didn’t see an endpoint in the swagger docs to upload a job spec file for processing on a specific worker.
@fx19880617: Thanks Nick! For 5, we recently added a new ImportData command: ```bin/pinot-admin.sh ImportData -dataFilePath file:/Users/xiangfu/temp/github-data/part-00006-493ff49d-d946-4437-9e37-523b90c3d96f-c000.gz.parquet -format parquet -table githubEvents```
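A minimal sketch of how the ImportData invocation above could be assembled for several files, assuming only the three flags shown in the message (`-dataFilePath`, `-format`, `-table`). The helper name `import_data_command` and the example file path are hypothetical; nothing here is part of Pinot itself.

```python
import shlex


def import_data_command(data_file_uri, data_format, table,
                        admin_script="bin/pinot-admin.sh"):
    """Return the argv list for one ImportData run.

    Only the flags quoted in the chat message are used; the helper
    itself is a hypothetical convenience wrapper.
    """
    return [
        admin_script, "ImportData",
        "-dataFilePath", data_file_uri,
        "-format", data_format,
        "-table", table,
    ]


if __name__ == "__main__":
    # Hypothetical local file; replace with a real path before running.
    cmd = import_data_command(
        "file:/path/to/part-00006.gz.parquet", "parquet", "githubEvents")
    # Print the shell-quoted command instead of executing it.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it; printing first makes it easy to review the command before running it on a cluster.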
@bowlesns: Also my pods kept dying because they would fill up the `tmp` directory, so to get around this I modified the helm chart’s statefulset and added a PV mount since my nodes didn’t have enough space.
@fx19880617: Thanks for reporting this. Are you able to see what’s inside the tmp directory?
@bowlesns: Thanks Xiang!
@bowlesns: Yes it’s downloading the files from my bucket in there. I have a lot so it fills up, node gets diskPressure, and the pod gets evicted.
@bowlesns: Also I wanted to help with some edits to the documentation, but it says it moved. How can I contribute?
@fx19880617: Pinot docs are at this repo:
@fx19880617: for the temp directory, I will add an option to allow configurable temp directory for segment generation
@bowlesns: Awesome thank you so much. That’ll save me from having to make a custom helm chart.
@fx19880617: yes, i know someone tried to mount a big pvc to /tmp :joy:
@bowlesns: guilty as charged ¯\_(ツ)_/¯
@kelly.revenaugh: Hey all —
@aaron: @aaron has joined the channel
@bowlesns: @bowlesns has joined the channel
@ysim: @ysim has joined the channel
#random
@khushbu.agarwal: @khushbu.agarwal has joined the channel
@sagar.khangan: @sagar.khangan has joined the channel
@aaron: @aaron has joined the channel
@bowlesns: @bowlesns has joined the channel
@ysim: @ysim has joined the channel
#troubleshooting
@khushbu.agarwal: @khushbu.agarwal has joined the channel
@sagar.khangan: @sagar.khangan has joined the channel
@devashish: Hi Team, what is the recommended way of updating a table schema? I used the following job to create my table:
```
apiVersion: batch/v1
kind: Job
metadata:
  name: request-realtime-table-creation
  namespace: data2
spec:
  template:
    spec:
      containers:
        - name: request-realtime-table-json
          image: apachepinot/pinot:latest
          args: [ "AddTable", "-schemaFile", "/var/pinot/examples/request_schema.json", "-tableConfigFile", "/var/pinot/examples/request_realtime_table_config.json", "-controllerHost", "pinot2-controller", "-controllerPort", "9000", "-exec" ]
          env:
            - name: JAVA_OPTS
              value: "-Xms4G -Xmx4G -Dpinot.admin.system.exit=true"
          volumeMounts:
            - name: examples
              mountPath: /var/pinot/examples
      restartPolicy: OnFailure
      volumes:
        - name: examples
          configMap:
            name: pinot-table
  backoffLimit: 100
```
@devashish: Currently I have updated the schema using the controller portal. I wanted to know if there is a way of doing this natively in k8s. Also, the schema change didn't show up in the console until I did a complete segment reload. Is this a mandatory step, or is there some workaround so that simple schema updates like a column addition are an O(1) operation?
@fx19880617: you can also use controller swagger api to update schema
@npawar: @devashish adding columns to a schema necessarily needs a reload :
@npawar: that’s how the changes trickle to the segments
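A rough sketch of the two steps discussed above (update the schema through the controller REST API, then reload segments so the change reaches them). The endpoint paths `PUT /schemas/{schemaName}` and `POST /segments/{tableName}/reload`, the JSON request body, and the schema/table name `request` are assumptions to check against your controller's Swagger page; the host and port come from the Job args earlier in the thread. The functions only build the requests and do not send anything.

```python
import json
import urllib.request

# Host and port taken from the AddTable job args; adjust for your cluster.
CONTROLLER = "http://pinot2-controller:9000"


def build_schema_update(schema: dict) -> urllib.request.Request:
    """Build (but do not send) a request that replaces an existing schema.

    Assumes the controller accepts the schema as a JSON body on
    PUT /schemas/{schemaName}; verify this in the Swagger UI.
    """
    name = schema["schemaName"]
    return urllib.request.Request(
        url=f"{CONTROLLER}/schemas/{name}",
        data=json.dumps(schema).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_segment_reload(table_name: str) -> urllib.request.Request:
    """Build (but do not send) a request that reloads a table's segments.

    Assumes POST /segments/{tableName}/reload; verify in the Swagger UI.
    """
    return urllib.request.Request(
        url=f"{CONTROLLER}/segments/{table_name}/reload",
        data=b"",
        method="POST",
    )
```

Sending each request with `urllib.request.urlopen(req)` from a small container would let the same pattern run as a Kubernetes Job, in the spirit of the AddTable job above.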
@devashish: Alright. On a different topic, I have created a table with
```
"dateTimeFieldSpecs": [
  {
    "name": "TimeStamp",
    "dataType": "LONG",
    "format": "1:SECONDS:EPOCH",
    "granularity": "1:HOURS"
  }
]
```
According to the documentation the granularity is used for bucketing. The table contains epoch values, so where is the bucketing used?
@npawar: granularity is not used anywhere atm. It will be used in the future for segment merge/rollups
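To make the term concrete: since the granularity is not applied by Pinot at the moment, the following is only an illustration of what bucketing epoch seconds to a `1:HOURS` granularity would mean arithmetically. It is not Pinot's implementation, and the function name is hypothetical.

```python
def bucket_epoch_seconds(ts_seconds: int, bucket_seconds: int = 3600) -> int:
    """Round an epoch-seconds timestamp down to the start of its bucket.

    With the default of 3600 seconds, every timestamp within the same
    hour maps to the epoch second at which that hour began.
    """
    return ts_seconds - (ts_seconds % bucket_seconds)
```

Two timestamps such as 7265 and 7300 both land in the bucket starting at 7200, which is the kind of grouping a merge or rollup could rely on.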
@devashish: Can it be done with a similar job with UpdateTable as args along with updated schema?
@aaron: @aaron has joined the channel
@aaron: Hi! I'm trying to set up Pinot for the first time (as a cluster) and am starting to set up S3 following the steps in
@fx19880617: is there more stacktrace from server side ?
@aaron: Sure, I'll paste more
@fx19880617: Is it during server startup or when you upload data to the table
@aaron: During server startup
@aaron:
```
2021/02/10 12:22:25.506 ERROR [PluginManager] [main] Failed to load plugin [pinot-s3] from dir [/<redacted>/apache-pinot-incubating-0.6.0-bin/plugins/pinot-file-system/pinot-s3]
java.lang.IllegalArgumentException: object is not an instance of declaring class
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    at org.apache.pinot.spi.plugin.PluginClassLoader.<init>(PluginClassLoader.java:50) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at org.apache.pinot.spi.plugin.PluginManager.createClassLoader(PluginManager.java:171) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at org.apache.pinot.spi.plugin.PluginManager.load(PluginManager.java:162) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at org.apache.pinot.spi.plugin.PluginManager.init(PluginManager.java:137) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at org.apache.pinot.spi.plugin.PluginManager.init(PluginManager.java:103) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at org.apache.pinot.spi.plugin.PluginManager.<init>(PluginManager.java:84) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at org.apache.pinot.spi.plugin.PluginManager.<clinit>(PluginManager.java:46) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
    at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:164) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
```
@aaron: I see a report of a similar issue with GCS here:
@fx19880617: ic, are you using the docker image or the baremetal
@fx19880617: with bin
@aaron: Baremetal
@fx19880617: can you try with java opts to only include the plugins you are using: `-Dplugins.include=pinot-s3,pinot-parquet`
@fx19880617: something like this
@fx19880617: ic
@fx19880617: which java version are you using ?
@aaron: Yeah, I'm already doing that with java opts
@aaron: java 11
@fx19880617: ok
@aaron: openjdk version "11.0.9.1" 2020-11-04
@fx19880617: hmm
@fx19880617: I think for java11, we are trying to include all the plugins into classpath as some classloader apis are deprecated from java 8
@fx19880617: if that doesn’t work just delete the un-used plugins from pinot-bin directory
@fx19880617: like pinot-gcs
@aaron: But I think this error message is specifically about pinot-s3
@aaron: `Failed to load plugin [pinot-s3] from dir [/scratch/aaron/projects/pinot/apache-pinot-incubating-0.6.0-bin/plugins/pinot-file-system/pinot-s3]`
@aaron: Ok, I deleted every plugin except for pinot-s3, and I still see the same error message.
@fx19880617: ok
@fx19880617: hmm
@fx19880617: it might be issue of jdk
@fx19880617: i will check this jdk 11.0.9.1?
@aaron: Is pinot not compatible with jdk11?
@aaron: Thanks
@fx19880617: it should
@aaron: If it helps:
```
$ java -version
openjdk version "11.0.9.1" 2020-11-04
OpenJDK Runtime Environment (build 11.0.9.1+1-post-Debian-1deb10u2)
OpenJDK 64-Bit Server VM (build 11.0.9.1+1-post-Debian-1deb10u2, mixed mode, sharing)
```
@tamas.nadudvari: We also ran into this error with the Java 11 docker images. It might be relevant that the component starts with this warning:
```
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.pinot.spi.plugin.PluginClassLoader (file:/opt/pinot/lib/pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar) to method java.net.URLClassLoader.addURL(java.net.URL)
WARNING: Please consider reporting this to the maintainers of org.apache.pinot.spi.plugin.PluginClassLoader
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
```
We decided to fall back to a jdk8-based docker image for now.
@fx19880617: sure, I will take a look. One thing you can try is to build Pinot from source with your current Java version
@fx19880617: right, for Java 11+ you need to add `--add-exports java.base/jdk.internal.ref=ALL-UNNAMED` to JAVA_OPTS
@fx19880617: it’s different apis for class loader
@fx19880617: those warnings are related to this behavior @tamas.nadudvari just fyi
@aaron: Is Java 11 actually supported or should I downgrade to Java 8?
@fx19880617: it is supported. Have you tried adding `--add-exports java.base/jdk.internal.ref=ALL-UNNAMED`?
@aaron: I tried adding that
@aaron: I still see the same error, and I also see the "reflective access" warnings still
@fx19880617: ic
@fx19880617: let me check then
@bowlesns: @bowlesns has joined the channel
@bowlesns: Hey team, thanks for setting this Slack up! Would appreciate any help on this: I’m trying to do a multi-line Groovy script like this in a query in the Pinot Query Console:
```
"""
def value = 'blah'
return value
"""
```
using this syntax: ```select groovy('{"returnType":"STRING","isSingleValue":true}', <GROOVY MULTI LINE HERE>, my_variable) as new_variable from table``` I have tried to cancel out the quotes, use single quotes, cancel any newlines, and cannot figure out how to get this to work. Any ideas?
@npawar: ```select groovy('{"returnType":"DOUBLE","isSingleValue":true}', 'def sumSales=0; arg0.eachWithIndex{item, index-> if (item != "mug") {sumSales = sumSales + arg1[index]}}; return sumSales', p1, p2) from fooTable limit 10``` here’s a multi-statement Groovy script that used to work for me
@bowlesns: Tried in that format and I get this every time
@bowlesns: ```select groovy( '{"returnType":"STRING","isSingleValue":true}', 'def var=1; return var') as myvar from table limit 10``` Running this returns the same result of the 200 error code.
@npawar: since you’ve specified return type STRING, you’d have to use `def var="1"`
@npawar: this works for me `select groovy('{"returnType":"STRING","isSingleValue":true}','def var="1"; return var') from foo limit 10`
@bowlesns: Sorry that was a bad copy paste. I had INT originally. If I paste what you put in and change the table it runs, but if I paste mine it doesn’t…guessing something is happening with the characters when I’m pasting from my text editor
@bowlesns: This works
@bowlesns: This doesn’t
@bowlesns: all I did for the second one was hit enter to put it on the next line.
@bowlesns: Using triple quotes doesn’t work. Cancelling the newline doesn’t work
@bowlesns: I’ve also tried iterations of using triple single quotes, triple gstring quotes, can’t get anything to work
@npawar: why does it need to be on the next line?
@bowlesns: readability, i’ve got some conditionals and other stuff in there so it won’t be pretty on one line :slightly_smiling_face:
@bowlesns: I’m guessing there’s got to be a way to cancel that newline out if that’s causing the issue. Thanks for your help by the way!
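One possible workaround, sketched under assumptions: keep the Groovy script readable in a source file and collapse it to a single line before pasting or sending the query. The sketch assumes every Groovy statement already ends with an explicit `;`, that no string literal in the script spans lines, and that single quotes inside a SQL string literal are escaped by doubling them. The function names are hypothetical and nothing here is a Pinot API.

```python
def to_single_line_groovy(script: str) -> str:
    """Collapse a multi-line Groovy script into one line for a SQL literal.

    Assumes statements are already terminated with ';' so that joining
    lines with a space does not change the script's meaning.
    """
    lines = [line.strip() for line in script.strip().splitlines()]
    one_line = " ".join(line for line in lines if line)
    # Assumption: a single quote inside a SQL string literal is doubled.
    return one_line.replace("'", "''")


def groovy_query(return_type, script, columns, table, limit=10):
    """Build a `select groovy(...)` query string in the shape used above."""
    config = '{"returnType":"%s","isSingleValue":true}' % return_type
    args = ", ".join(
        ["'%s'" % config, "'%s'" % to_single_line_groovy(script), *columns])
    return "select groovy(%s) from %s limit %d" % (args, table, limit)


if __name__ == "__main__":
    readable_script = """
        def var = "1";
        return var
    """
    print(groovy_query("STRING", readable_script, [], "foo"))
```

For the sample script this produces the same one-line form that ran successfully earlier in the thread, apart from spacing around `=`.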
@ysim: @ysim has joined the channel
#docs
@bowlesns: @bowlesns has joined the channel
#pinot-dev
@khushbu.agarwal: @khushbu.agarwal has joined the channel
#presto-pinot-connector
@bowlesns: @bowlesns has joined the channel
#discuss-validation
@chinmay.cerebro: @mayanks mind reviewing before we commit ?
@mayanks: Will do today
@chinmay.cerebro: thank you
#pinot-perf-tuning
@bowlesns: @bowlesns has joined the channel
