#general
@gulshan.yadav: @gulshan.yadav has joined the channel
@daniel: @here I was wondering whether you might be able to point me to some rule-of-thumb compression benchmarks for Pinot? Cheers
@daniel: I've come across Snappy, run-length encoding, and so on. Do users need to care about these, or are they under-the-hood defaults?
@g.kishore: It depends on the input data format. • If it's a row format like CSV, JSON, Avro, or Protobuf, you can see anywhere between 3x and 10x compression. • If it's columnar like ORC/Parquet, then you don't see a lot of compression: 0.9x to 1.1x.
@ranabanerji: @ranabanerji has joined the channel
@kha.nguyen: @kha.nguyen has joined the channel
@mayanks: Congratulations to the Data Sketches team (@leerho) on graduating to an Apache top-level project. Glad to mention that Apache Pinot already provides support for Theta-Sketches based count-distinct (and set expression evaluations):
@murat.ozcan: @murat.ozcan has joined the channel
@hansospina: @hansospina has joined the channel
@karinwolok1: :loudspeaker: Event updates! :loudspeaker: In case you missed this last time, @npawar (Pinot PMC & Committer) and Tim Berglund (Kafka / Confluent) will be presenting tomorrow:
@terrysv: @terrysv has joined the channel
@huangzhenqiu0825: @huangzhenqiu0825 has joined the channel
@tymm:
#random
@gulshan.yadav: @gulshan.yadav has joined the channel
@ranabanerji: @ranabanerji has joined the channel
@kha.nguyen: @kha.nguyen has joined the channel
@murat.ozcan: @murat.ozcan has joined the channel
@hansospina: @hansospina has joined the channel
@terrysv: @terrysv has joined the channel
@huangzhenqiu0825: @huangzhenqiu0825 has joined the channel
#troubleshooting
@gulshan.yadav: @gulshan.yadav has joined the channel
@ranabanerji: @ranabanerji has joined the channel
@kha.nguyen: @kha.nguyen has joined the channel
@kha.nguyen: Hi everyone. I'm currently trying to import some batch data to my Pinot cluster and I'm running into some issues with doing this. I have the latest version of Pinot (0.7.0) in a docker container, and I set everything up manually. I followed the docker version of this guide here:
@wrbriggs: @kha.nguyen With the `APPEND` push type, even with an offline table, I am pretty sure a primary time column is mandatory. Your schema doesn’t define one.
@wrbriggs: Also, your table definition doesn’t contain a `timeColumnName` value either. However, the example you’re running is likely trying to push the schema first, and that’s where it’s failing, so you’re not even getting to the table creation or loading the batch CSV.
@wrbriggs: According to the docs: ```The primary time column is used by Pinot, for maintaining the time boundary between offline and realtime data in a hybrid table and for retention management. A primary time column is mandatory if the table's push type is APPEND and optional if the push type is REFRESH.``` (see `DateTime`
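To make the rule quoted above concrete, here is a minimal sketch of a pre-flight check one could run before pushing a schema and table config. It is not part of Pinot: the function name and the example table are hypothetical, and it assumes the push type lives under `segmentsConfig.segmentPushType`, that a missing push type should be treated as `APPEND`, and that the time column is declared in the schema's `dateTimeFieldSpecs`.

```python
from typing import Dict, List


def primary_time_column_problems(schema: Dict, table_config: Dict) -> List[str]:
    """Return human-readable problems; an empty list means the check passed.

    Hypothetical helper, not a Pinot API.
    """
    problems = []
    segments = table_config.get("segmentsConfig", {})
    # Assumption: treat a missing push type as APPEND, the stricter case.
    push_type = str(segments.get("segmentPushType", "APPEND")).upper()
    time_column = segments.get("timeColumnName")
    declared = {spec["name"] for spec in schema.get("dateTimeFieldSpecs", [])}

    if push_type == "APPEND":
        if not time_column:
            problems.append("segmentsConfig.timeColumnName is missing")
        elif time_column not in declared:
            problems.append(
                "time column %r is not declared in dateTimeFieldSpecs" % time_column
            )
    return problems


# Illustrative inputs only; the table and column names are made up.
schema_without_time = {
    "schemaName": "transcript",
    "dimensionFieldSpecs": [{"name": "studentID", "dataType": "INT"}],
}
schema_with_time = dict(
    schema_without_time,
    dateTimeFieldSpecs=[
        {
            "name": "timestampInEpoch",
            "dataType": "LONG",
            "format": "1:MILLISECONDS:EPOCH",
            "granularity": "1:MILLISECONDS",
        }
    ],
)
append_table = {
    "tableName": "transcript",
    "tableType": "OFFLINE",
    "segmentsConfig": {"segmentPushType": "APPEND"},
}
fixed_table = {
    "tableName": "transcript",
    "tableType": "OFFLINE",
    "segmentsConfig": {
        "segmentPushType": "APPEND",
        "timeColumnName": "timestampInEpoch",
    },
}

print(primary_time_column_problems(schema_without_time, append_table))
print(primary_time_column_problems(schema_with_time, fixed_table))
```

With a `REFRESH` push type the same check returns no problems, since the quoted docs make the primary time column optional in that case.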
@mailtobuchi: @mailtobuchi has left the channel
@murat.ozcan: @murat.ozcan has joined the channel
@hansospina: @hansospina has joined the channel
@terrysv: @terrysv has joined the channel
@huangzhenqiu0825: @huangzhenqiu0825 has joined the channel
#pinot-s3
@hansospina: @hansospina has joined the channel
#onboarding
@hansospina: @hansospina has joined the channel
#pinot-dev
@hansospina: @hansospina has joined the channel
@jlli: Hey @slack1 we recently found a bug in this PR (
@slack1: Hey Jack, thanks for the PR. Quick question - did you check with the most recent master? we just merged a related PR 2 days ago:
@jlli: Yes, I noticed that. But this time it’s on pinot-server. The PR you pointed out is for pinot-controller and pinot-broker.
@slack1: I could have done a better job with the PR description. we actually fixed the default behavior for server admin port too (see bottom change to ListenerConfigUtil.java)
@jlli: Cool, I’ve verified that the latest fix works. Thanks for pointing that out! I can close the current PR now.
#community
@terrysv: @terrysv has joined the channel
#announcements
@terrysv: @terrysv has joined the channel
#discuss-validation
@chinmay.cerebro: FYI: Opened a new PR for some missing validation. @mayanks might want to eyeball it. I don't think this will cause any issues in the LinkedIn integration tests, but it doesn't hurt to verify.
@chinmay.cerebro:
@chinmay.cerebro: lol - it just broke a bunch of integration tests :slightly_smiling_face:. Looks like some integration tests are specifying a range index on a non-numeric column.
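The kind of check being discussed can be sketched roughly as follows. This is only an illustration, not the validation from the PR (which is not shown in this thread): the function name is hypothetical, and it assumes range-indexed columns are listed under `tableIndexConfig.rangeIndexColumns` and that "numeric" means `INT`, `LONG`, `FLOAT`, or `DOUBLE`.

```python
from typing import Dict, List

# Assumption: these are the data types treated as numeric for this check.
NUMERIC_TYPES = {"INT", "LONG", "FLOAT", "DOUBLE"}


def non_numeric_range_index_columns(schema: Dict, table_config: Dict) -> List[str]:
    """List range-index columns whose schema type is not numeric.

    Hypothetical helper, not a Pinot API. Columns absent from the
    schema are reported too, since their type cannot be confirmed.
    """
    types = {}
    for key in ("dimensionFieldSpecs", "metricFieldSpecs", "dateTimeFieldSpecs"):
        for spec in schema.get(key, []):
            types[spec["name"]] = spec["dataType"]

    index_config = table_config.get("tableIndexConfig", {})
    range_columns = index_config.get("rangeIndexColumns") or []
    return [col for col in range_columns if types.get(col) not in NUMERIC_TYPES]


# Illustrative inputs only; the column names are made up.
example_schema = {
    "dimensionFieldSpecs": [{"name": "city", "dataType": "STRING"}],
    "metricFieldSpecs": [{"name": "score", "dataType": "DOUBLE"}],
}
example_table = {"tableIndexConfig": {"rangeIndexColumns": ["score", "city"]}}

print(non_numeric_range_index_columns(example_schema, example_table))
```

A test table configured like `example_table` would be rejected by such a check because of `city`, which matches the kind of failure described above.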
#getting-started
@hansospina: @hansospina has joined the channel
